(57) Abstract: The invention provides
a system (100) and methods for in-situ
identification of one or more regions of
tissue (138) at which there is a likelihood
of disease. The invention generally 
relates to methods and devices for 
acquiring, analyzing, processing, and 
displaying optical data and diagnostic
results from a patient sample. For 
example, methods of the invention 
comprise obtaining spectral and visual 
data from a patient sample, calibrating 
the data, compensating for sample 
motion, arbitrating between redundant 
data sets, identifying potentially 
non-representative data, analyzing 
the data, and displaying the diagnostic
results. The invention provides the 
option of real-time data processing and 
diagnosis.

METHODS AND APPARATUS FOR
CHARACTERIZATION OF TISSUE SAMPLES 



RELATED APPLICATIONS 

[0001] The present application claims the benefit of U.S. Patent Application Serial Number 
10/243,535, filed September 13, 2002, and U.S. Provisional Patent Application Serial Number 
60/394,696, filed July 9, 2002. Additionally, the present application claims the benefit of the 
following commonly-owned applications: U.S. Patent Application Serial Number 10/418,415;
U.S. Patent Application Serial Number 10/418,668; U.S. Patent Application Serial Number
10/418,902; U.S. Patent Application Serial Number 10/418,922; U.S. Patent Application Serial
Number 10/418,973; U.S. Patent Application Serial Number 10/418,974; U.S. Patent
Application Serial Number 10/418,975; and U.S. Patent Application Serial Number 10/419,181,
all of which were filed on April 18, 2003.

FIELD OF THE INVENTION

[0002] This invention relates generally to image processing and spectroscopic methods. More 
particularly, in certain embodiments, the invention relates to the diagnosis of disease in tissue 
using spectral analysis and/or image analysis. 

BACKGROUND OF THE INVENTION

[0003] It is common in the field of medicine to perform visual examination to diagnose 
disease. For example, visual examination of the cervix can discern areas where there is a 
suspicion of pathology. However, direct visual observation alone may be inadequate for proper 
identification of an abnormal tissue sample, particularly in the early stages of disease. 

[0004] In some procedures, such as colposcopic examinations, a chemical agent, such as acetic
acid, is applied to enhance the differences in appearance between normal and pathological tissue. 
Such acetowhitening techniques may aid a colposcopist in the determination of areas in which 
there is a suspicion of pathology. 

[0005] Colposcopic techniques are not perfect. They generally require analysis by a highly-trained
physician. Colposcopic images may contain complex and confusing patterns and may be
affected by glare, shadow, or the presence of blood or other obstruction, rendering an 
indeterminate diagnosis. 



WO 2004/005895 



PCT/US2003/021347 




[0006] Spectral analysis has increasingly been used to diagnose disease in tissue. Spectral 
analysis is based on the principle that the intensity of light that is transmitted from an illuminated 
tissue sample may indicate the state of health of the tissue. As in colposcopy examination, 
spectral analysis of tissue may be conducted using a contrast agent such as acetic acid. In 
spectral analysis, the contrast agent is used to enhance differences in the light that is transmitted
from normal and pathological tissues. 

[0007] Spectral analysis offers the prospect of at least partially-automated diagnosis of tissue 
using a classification algorithm. However, examinations using spectral analysis may be 
adversely affected by glare, shadow, or the presence of blood or other obstruction, rendering an 
indeterminate diagnosis. Some artifacts may not be detectable by analysis of the spectral data
alone; hence, erroneous spectral data may be inseparable from valid spectral data. Also, the 
surface of a tissue sample under spectral examination is generally not homogeneous. Areas of 
disease may be interspersed among neighboring healthy tissue, rendering overly-diffuse spectral 
data erroneous. 

[0008] A typical tissue classification algorithm applies a single statistical technique to
determine the probability that data from a tissue sample falls within a certain predetermined 
class. The result may be inaccurate, and may vary depending on the assumptions of the 
statistical technique applied. Furthermore, examinations using spectral analysis may be 
adversely affected by glare, shadow, or the presence of blood or other obstruction, rendering
inaccurate tissue-class probabilities.

[0009] Current methods of displaying data based on tissue classification algorithms do not 
facilitate quick, accurate, or clear communication of diagnostic results. Current techniques 
generally require interpretation by a skilled medical professional for meaningful and accurate 
conveyance of diagnostic information, due, in part, to the unfiltered nature of the diagnostic data. 

[0010] Current methods of calibrating spectral data acquisition systems do not provide
sufficient accuracy or repeatability needed for pinpoint tissue diagnosis. Current calibration 
methods do not adequately account for stray light effects, chromatic aberrations, and spatial 
inhomogeneities. Furthermore, areas of disease may be interspersed among neighboring healthy 
tissue, rendering overly-diffuse, improperly-calibrated spectral data. 

[0011] Current focusing methods generally do not provide sufficiently accurate levels of focus
for acquiring diagnostic optical data from a tissue sample. High quality focus is necessary to 
provide data with sufficiently low noise. Tissue surface roughness, as well as obstructions such 
as glare, shadow, and blood, make achieving adequate focus difficult. Even where adequate
focus levels may be achieved, current focusing methods are generally not fast enough to allow
acquisition of diagnostically-relevant optical data. Focusing speed is important, for example, in 
optical analysis of an acetowhitening examination, since spectral data must be obtained within a 
finite period of time following application of acetic acid to the tissue. Furthermore, current 
focusing techniques are not sufficiently robust such that consistent focus levels are achieved over
the lifetime of the optical instrument. 

[0012] It is important that elements in tissue sample images be easily discernible for purposes 
of diagnosis. Brightness variations, tissue surface variations, and obstructions such as glare, 
shadow, and blood, can make diagnosis difficult. Current image processing techniques may be 
used to improve images of tissue samples to facilitate diagnosis. However, current techniques
are often inadequate, since image adjustments are generally based on the entire image, including 
portions which are of no diagnostic interest. 

[0013] Thus, there exists a need to improve the accuracy with which regions of interest of a 
tissue sample are identified, and with which the condition of those regions is classified. There
exists a need for an improved method of determining tissue-class probabilities for a tissue
sample. There exists a general need for more accurate spectral analysis methods for diagnosing
tissue; more specifically, there is a need to reduce the inaccuracy of tissue classification
algorithms due to erroneous spectral data. There exists a need to improve the ease, accuracy, and
clarity with which diagnostic data are displayed. There exists a general need for more accurate
and more precise calibration methods for spectral data acquisition systems. There exists a need
to improve focusing accuracy, speed, and robustness in optical systems that acquire diagnostic
optical data. There exists a need to improve methods of enhancing tissue sample images for
diagnostic purposes.

SUMMARY OF THE INVENTION

Characterization of Tissue Samples

[0014] The invention provides a system and methods for in-situ identification of one or more 
regions of tissue at which there is a likelihood of disease. The invention generally relates to 
methods and devices for acquiring, analyzing, processing, and displaying optical data and results 
obtained from a patient sample. For example, methods of the invention comprise obtaining
spectral and visual data, calibrating the data, compensating for sample motion, arbitrating
between redundant data sets, identifying potentially non-representative data, analyzing the data,
and displaying the results. The invention provides the option of real-time spectral and image 
data processing.

[0015] The invention achieves greater diagnostic accuracy, in part, by properly identifying and
accounting for data from regions that are affected by an obstruction and/or regions that lie 
outside a diagnostic zone of interest. A region of a tissue sample may be obstructed, for 
example, by mucus, fluid, foam, a portion of a speculum or other medical instrument, glare, 
shadow, and/or blood. Regions that lie outside a zone of interest include, for example, a vaginal 
wall, an os, a cervical edge, and tissue in the vicinity of a smoke tube. Obstructed and outlier
regions are those from which optical data are ambiguous or cannot be classified. Once data from 
the obstructed regions and regions outside a zone of interest are identified, they are processed by 
either elimination (hard masking) or by weighting (soft masking) in a tissue classification 
algorithm. The weighting may indicate the likelihood that data are actually obtained from an 
obstructed or outlier region. 

[0016] Data masking algorithms of the invention automatically identify data from regions that 
are obstructed and regions that lie outside a zone of interest of the tissue sample. Some of the
masks of the invention use spectral data, other masks use image data, and still other masks use 
both spectral and image data from a region in order to determine whether the region is obstructed 
and/or lies outside the zone of interest. The invention provides greater diagnostic accuracy by 
automatically masking data that might otherwise give rise to a false diagnosis.

[0017] In addition, the invention provides methods of obtaining and arbitrating between
redundant sets of certain types of data obtained from the same region of tissue. For example, one 
embodiment comprises obtaining two sets of reflectance spectral data from the same region, 
where each set is obtained using light incident to the region at a different angle. In this way, if
one set of data is affected by an artifact, such as glare, shadow, or other obstruction, the other set 
of data provides a back-up that may not be affected by the artifact. The invention comprises 
methods of automatically determining whether one or more data sets is/are affected by an
artifact, and provides methods of arbitrating between the multiple data sets in order to select a 
representative set of data for the region. 
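
The arbitration described in this paragraph can be pictured with the following minimal Python sketch. It assumes two reflectance spectra from the same region, each a list of values between 0 and 1. The function names, the mean-reflectance artifact test, the threshold values, and the choice to average the two sets when neither is flagged are all illustrative assumptions and are not taken from this application.

```python
from statistics import mean

# Hypothetical thresholds on mean reflectance; illustrative values only.
GLARE_MAX = 0.9    # above this, the spectrum is assumed to be saturated by glare
SHADOW_MIN = 0.05  # below this, the spectrum is assumed to be in shadow


def is_artifact(spectrum):
    """Flag a spectrum whose mean reflectance suggests glare or shadow."""
    level = mean(spectrum)
    return level > GLARE_MAX or level < SHADOW_MIN


def arbitrate(set_a, set_b):
    """Select a representative spectrum for one region, or None if both are suspect."""
    a_bad, b_bad = is_artifact(set_a), is_artifact(set_b)
    if a_bad and b_bad:
        return None                      # no usable data: region is indeterminate
    if a_bad:
        return list(set_b)               # fall back on the unaffected set
    if b_bad:
        return list(set_a)
    # Neither set is flagged: average the two point by point (an assumption).
    return [(x + y) / 2 for x, y in zip(set_a, set_b)]
```

Returning `None` when both sets are flagged leaves the decision about the region to a later step, such as characterizing it as indeterminate.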

[0018] The invention offers increased diagnostic sensitivity and specificity by combining a 
plurality of statistical classification techniques to determine tissue-class probabilities for a given 
region of a tissue sample. Furthermore, in one embodiment, the invention comprises combining 
one or more statistical classification techniques with one or more non-statistical approaches.

[0019] Tissue diagnostic information, especially relating to the disease state of the tissue, may
not be determinable using only statistical approaches. For example, optical data obtained from a 
tissue sample may indicate levels of substances - such as collagen, porphyrin, FAD, and/or
NADH - which may be related to a tissue classification. In those cases, non-statistically-derived
information may be taken into account by applying a classification metric that is used with one 
or more statistical classification schemes, as part of the overall processing of data. Alternatively 
or additionally, the overall processing scheme includes analyzing image data, such as 
acetowhitening kinetic data, to determine tissue-class probabilities. The effectiveness of such
techniques is further increased when coupled with the data masking techniques introduced 
above. 

[0020] Soft or hard masks may be applied in the present invention in order to obtain a 
probability of a specific tissue condition. For example, processing of optical data in connection
with the application of a necrosis mask may provide a probability that a specific region of tissue
is necrotic. The masking parameters may be set such that the result is binary (i.e., the tissue-class
probability is either 0 or 1.0). Thus, the result of masking may itself be an expression of a
tissue-class probability, and may encompass a data processing step according to the invention.

[0021] Systems of the invention allow performing fast and accurate image and spectral scans
of tissue, such that both image and spectral data are obtained from each of a plurality of regions
of the tissue sample. Each data point is keyed to its respective region, and the data are used to 
characterize the condition of each of the regions of interest. In one embodiment, spectral and 
image data are acquired from a tissue sample over an approximately 10 to 15 second interval of
time. In other embodiments, the scanning time may be longer or shorter. 

[0022] Small patient movements, such as those due to breathing, may adversely affect how
certain spectral and image data are keyed to regions of the tissue sample. Thus, the invention 
comprises compensating for image misalignment caused by patient movement during data 
acquisition. Furthermore, validating misalignment corrections improves the accuracy of 
diagnostic procedures that utilize data obtained over an interval of time, particularly where the
misalignments are small and the need for accuracy is great. Methods of the invention may be
performed in real time by determining misalignment corrections, validating them, and adjusting 
for them at the same time that optical data are being obtained. 
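
As a rough illustration of determining and validating a misalignment correction, the Python sketch below searches small integer shifts between a reference image and a later image and accepts the best shift only if its residual error is low. This passage does not specify an alignment method, so the exhaustive search, the mean-squared-error score, the function names, and the `max_error` tolerance are all assumptions made for the example.

```python
def best_shift(ref, img, max_shift=2):
    """Return (dx, dy, error) such that ref[y][x] best matches img[y + dy][x + dx]."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(h):
                for x in range(w):
                    sy, sx = y + dy, x + dx
                    if 0 <= sy < h and 0 <= sx < w:   # compare the overlap only
                        total += (ref[y][x] - img[sy][sx]) ** 2
                        count += 1
            if count == 0:
                continue                              # no overlap at this shift
            score = total / count                     # mean squared error
            if best is None or score < best[0]:
                best = (score, dx, dy)
    score, dx, dy = best
    return dx, dy, score


def validated_shift(ref, img, max_error=1e-6):
    """Return (dx, dy) only if the residual error passes a hypothetical tolerance."""
    dx, dy, score = best_shift(ref, img)
    return (dx, dy) if score <= max_error else None
```

A practical system would likely work at sub-pixel resolution and on larger images; the sketch is meant only to show the determine-then-validate order.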

[0023] Accordingly, the invention comprises obtaining both spectral and image data from one 
or more regions of a tissue sample, arbitrating between redundant data sets obtained from each 
region, automatically masking the data to identify regions that are outside a zone of interest or
are affected by an obstruction, processing spectral data using one or more statistical 
classification techniques and one or more metrics having a non-statistically-based component, 
and characterizing a condition of each region according to the classification and masking results.
Methods of the invention preferentially are carried out using an optical detection device adapted
to obtain spectral data from a plurality of regions of a tissue sample. Such a device also 
comprises a memory that stores code defining a set of instructions, and a processor that executes 
the instructions to perform a method of determining a condition of each of one or more of the
regions. In one embodiment, the method includes identifying spectral data obtained from
substantially unobstructed regions of the tissue sample within a zone of interest, determining 
tissue-class probabilities using the identified spectral data, and determining a condition of one or 
more regions using the tissue-class probabilities. The identifying step may include image 
masking, spectral masking, or both. In some instances, characterizing a condition of a region
means using the masking result to characterize the region as indeterminate, thereby trumping the
classification result.

Determination of Tissue-Class Probabilities 

[0024] The invention provides methods for determining a tissue-class probability of a region of
a tissue sample. A tissue-class probability is a probability that a given region of a tissue sample
contains tissue of a predetermined type, such as CIN 1 (cervical intraepithelial neoplasia, grade
1), CIN 2/3 (cervical intraepithelial neoplasia grades 2 and/or 3), normal squamous, normal 
columnar, and metaplasia, for example. Tissue-class probabilities are useful in characterizing 
the condition (e.g., disease state, response to treatment, cell type, etc.) of a tissue.

[0025] The invention provides increased diagnostic sensitivity and specificity by combining a
plurality of statistical classification techniques to determine tissue-class probabilities for a tissue
sample. Furthermore, in one embodiment, the invention comprises combining one or more 
statistical classification techniques with one or more non-statistical approaches in order to 
determine a condition of a tissue sample. 

[0026] The invention provides increased diagnostic accuracy by applying two or more 
statistical classification techniques to data from a region of tissue. The two or more techniques
may use different input data from the region. For example, reflectance data from a region 
corresponding to a first wavelength range may be used to determine a first set of tissue-class 
probabilities, while data corresponding to a second wavelength range, different from the first, 
may be used to determine a second set of tissue-class probabilities. Then, the invention 
comprises determining a set of overall tissue-class probabilities based on the first and second sets
of tissue-class probabilities. 
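
One simple way to form overall tissue-class probabilities from two sets is sketched below in Python. The passage above does not say how the two sets are combined, so the equal-weight average followed by renormalization, the function name, and the example class labels are assumptions made for illustration.

```python
def combine_probabilities(first, second):
    """Average two per-class probability dicts and renormalize so they sum to 1."""
    classes = first.keys() & second.keys()          # classes present in both sets
    averaged = {c: (first[c] + second[c]) / 2 for c in classes}
    total = sum(averaged.values())
    if total == 0:
        raise ValueError("no probability mass to normalize")
    return {c: p / total for c, p in averaged.items()}


# Example: probabilities derived from two different wavelength ranges.
range_one = {"CIN 2/3": 0.5, "normal squamous": 0.5}
range_two = {"CIN 2/3": 0.75, "normal squamous": 0.25}
overall = combine_probabilities(range_one, range_two)
```

Other combination rules, such as a product of probabilities or a weighted average, would fit the same interface.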

[0027] In another embodiment, the two or more techniques differ in that they have different 
statistical bases. For example, one embodiment of the invention comprises determining a first
set of tissue-class probabilities by applying a statistical method based on maximal variance of
data between known classes, and determining a second set of tissue-class probabilities by 
applying a statistical method based on maximal discrimination of data between known classes. 
Overall tissue-class probabilities are then computed using the two sets of probabilities resulting 
from the two statistical methods.

[0028] Tissue diagnostic information, especially relating to the disease state of the tissue, may 
not be determinable using only statistical approaches. For example, optical data obtained from a 
tissue sample may indicate levels of substances - such as collagen, porphyrin, FAD, and/or
NADH - which may be related to a tissue classification. In those cases, non-statistically-derived
information may be taken into account by applying a classification metric that is used with one
or more statistical classification schemes, as part of the overall processing of data. Accuracy
may be increased further still by application of data masking algorithms.

[0029] Data masking algorithms of the invention automatically identify data from regions that
are obstructed and regions that lie outside a zone of interest of the tissue sample. Some of the
masks of the invention use spectral data, other masks use image data, and still other masks use
both spectral and image data from a region in order to determine whether the region is obstructed 
and/or lies outside a zone of interest. A region of a tissue sample may be obstructed, for 
example, by mucus, fluid, foam, a portion of a speculum or other medical instrument, glare, 
shadow, and/or blood. Regions that lie outside a zone of interest include, for example, a vaginal
wall, an os, a cervical edge, and tissue in the vicinity of a smoke tube. Generally, obstructed and
outlier regions are those from which optical data are ambiguous or cannot be classified.

[0030] The invention provides greater diagnostic accuracy by automatically masking data that
might otherwise result in erroneous tissue-class probabilities. For example, data from regions 
identified as obstructed or outside a zone of interest may be "hard masked" - that is, eliminated
prior to computation of tissue-class probabilities. These regions may be characterized as having
an indeterminate condition. 

[0031] In some cases, data from regions that are only partially obstructed or which lie only 
partially outside a zone of interest are still used to determine tissue-class probabilities. These 
probabilities may be "soft masked" - that is, weighted according to a likelihood a point within
the region is affected by an obstruction and/or lies outside a zone of interest.
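
The difference between hard and soft masking can be sketched in Python as follows. The sketch assumes each region has one tissue-class probability and one obstruction likelihood between 0 and 1. The function name, the `hard_threshold` value, and the `1 - likelihood` weighting are illustrative assumptions rather than parameters stated in this application.

```python
def apply_masks(region_probs, obstruction_likelihood, hard_threshold=0.9):
    """Hard-mask or soft-mask per-region probabilities.

    region_probs: {region_id: tissue-class probability}
    obstruction_likelihood: {region_id: likelihood in [0, 1] that the region is
        obstructed or outside the zone of interest}
    """
    result = {}
    for region, prob in region_probs.items():
        likelihood = obstruction_likelihood.get(region, 0.0)
        if likelihood >= hard_threshold:
            result[region] = None                        # hard mask: eliminated
        else:
            result[region] = prob * (1.0 - likelihood)   # soft mask: down-weighted
    return result


masked = apply_masks(
    {"a": 0.8, "b": 0.8, "c": 0.8},
    {"a": 0.0, "b": 0.5, "c": 1.0},
)
```

Here region `c` is eliminated and would be characterized as indeterminate, while region `b` still contributes with half its original weight.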

[0032] Soft or hard masks may be applied in the present invention in order to obtain a 
probability of a specific tissue condition. For example, processing of optical data in connection 
with the application of a necrosis mask may provide a probability that a specific region of tissue
is necrotic. The masking parameters may be set such that the result is binary (i.e., the tissue-class
probability is either 0 or 1.0). Thus, the result of masking may itself be an expression of a
tissue-class probability, and may encompass a data processing step according to the invention.

[0033] In addition, the invention provides methods of obtaining and arbitrating between
redundant sets of data obtained from a tissue sample. For example, one embodiment comprises
obtaining two sets of reflectance spectral data from the same region of a tissue sample, wherein 
each set is obtained using light incident to the region at a different angle. In this way, if one set 
of data is affected by an artifact, such as glare, shadow, or other obstruction, the other set of data 
provides a back-up that may not be affected by the artifact. The invention comprises methods of
automatically determining whether one or more data sets is/are affected by an artifact, and
provides methods of arbitrating between the multiple data sets in order to select a representative
set of data for the region. 

[0034] Accordingly, the invention comprises obtaining both spectral and image data from one
or more regions of a tissue sample, arbitrating between redundant data sets obtained from each
region, automatically masking the data to identify regions that are outside a zone of interest or
are affected by an obstruction, and processing the data using a plurality of statistical tissue 
classification techniques to determine, for each member of a set of predefined tissue classes, a 
probability that the region comprises tissue within the predefined class. Methods of the 
invention also comprise evaluating a classification metric having a non-statistically-based
component, and characterizing a condition of the region according to either the classification
metric (if satisfied) or the set of tissue-class probabilities. 
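
The decision order in the last sentence above can be illustrated with a short Python sketch. It assumes the metric has already been evaluated to a boolean and that, when the metric is not satisfied, the region is assigned the class with the highest probability. The function name, the arguments, and the highest-probability rule are hypothetical choices for the example.

```python
def characterize_region(tissue_class_probs, metric_satisfied, metric_condition):
    """Characterize a region by the metric if satisfied, else by the probabilities."""
    if metric_satisfied:
        return metric_condition          # the metric overrides the statistics
    # Otherwise pick the most probable tissue class (an assumed rule).
    return max(tissue_class_probs, key=tissue_class_probs.get)


probs = {"CIN 2/3": 0.7, "normal squamous": 0.3}
by_statistics = characterize_region(probs, False, "necrotic")
by_metric = characterize_region(probs, True, "necrotic")
```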
Spectral Masking 

[0035] The invention provides methods for processing tissue-derived spectral data for use in a 
classification algorithm. Methods of the invention comprise application of spectral and/or image
masks for separating ambiguous or unclassifiable spectral data from valid spectral data.

[0036] More specifically, the invention improves the accuracy of tissue classification, in part, 
by properly identifying and accounting for spectral data that are non-representative of a zone of 
interest of a tissue sample, for example, spectral data from tissue regions that are affected by an 
obstruction and/or regions that lie outside a diagnostic zone of interest. During examination of
tissue, a portion of the tissue may be obstructed, for example, by mucus, fluid, foam, a medical
instrument, glare, shadow, and/or blood. Moreover, tissue examination may include data from 
portions of the tissue that lie outside an identified zone of interest. Regions that lie outside a 
zone of interest include, for example, a tissue wall (e.g., a vaginal wall), an os, an edge surface of
a tissue (e.g., a cervical edge), and tissue in the vicinity of a smoke tube. Once data from the
obstructed regions and regions outside a zone of interest are identified, they are processed by 
either elimination (hard masking) or by weighting (soft masking) in a tissue classification 
algorithm. 

[0037] Therefore, a preferred method of the invention comprises applying spectral masks to
automatically identify data from regions of a tissue sample that are obstructed or lie outside a 
zone of interest. Regions from which such data are obtained are then identified and 
characterized as being indeterminate. Data from these regions may then be eliminated from 
further processing in the tissue classification algorithm. 

[0038] In some cases, data from a region that is only partially obstructed or which lies only
partially outside a zone of interest are still used in a tissue classification scheme, for example, to 
determine tissue-class probabilities. Those probabilities may be "soft masked" - that is, 
weighted according to a likelihood the region (or a point within the region) is affected by an
obstruction and/or lies outside a zone of interest. 

[0039] The invention also provides methods of processing spectral data by applying spectral
masks in conjunction with image masks. Image masks are similar to spectral masks, except that 
image masks are based on image data from the tissue sample - for example, luminescence or 
RGB intensities. Methods of the invention comprise determining an overlap between regions 
identified by a spectral mask and those identified by an image mask. Data from the area of
overlap are classified as indeterminate or are appropriately weighted, according to the tissue
classification algorithm. 
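
Determining the overlap between the two kinds of mask amounts to a set intersection when each mask is represented as a set of region identifiers. The Python sketch below assumes that representation. The function names and the `"indeterminate"` and `"classify"` labels are hypothetical, and the sketch shows only the hard-masking branch, not the weighting alternative.

```python
def mask_overlap(spectral_masked, image_masked):
    """Regions identified by both the spectral mask and the image mask."""
    return set(spectral_masked) & set(image_masked)


def label_regions(all_regions, spectral_masked, image_masked):
    """Mark overlapping regions as indeterminate; pass the rest on to classification."""
    overlap = mask_overlap(spectral_masked, image_masked)
    return {r: ("indeterminate" if r in overlap else "classify") for r in all_regions}


labels = label_regions(range(5), spectral_masked={1, 2, 3}, image_masked={2, 3, 4})
```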

[0040] A spectral mask as applied in the present invention may take the form of a metric. A 
metric may include a series of logical statements, each comparing a single spectral measurement 
or a combination of spectral measurements obtained from a given region of a tissue sample to a 
threshold value. If the metric is satisfied, the corresponding region is considered to be
"masked." 
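
A metric of this kind can be written as a few boolean comparisons, as in the Python sketch below. The wavelengths (499 nm and 700 nm), the threshold values, and the way the statements are combined with `or` are placeholders chosen for illustration and are not values disclosed in this passage.

```python
def spectral_metric(reflectance):
    """Return True if the region is considered masked.

    reflectance: {wavelength_nm: reflectance value} for one region.
    """
    # Statement 1: a single spectral measurement compared to a threshold.
    low_signal = reflectance[499] < 0.10

    # Statement 2: a combination (ratio) of measurements compared to a threshold.
    denominator = reflectance[700]
    ratio = reflectance[499] / denominator if denominator else float("inf")
    flat_spectrum = ratio > 0.95

    # The metric is satisfied if either statement holds (an assumed combination).
    return low_signal or flat_spectrum
```

For example, `spectral_metric({499: 0.3, 700: 0.6})` evaluates neither statement as true, so that region would not be masked.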

[0041] In addition to filtering erroneous data, spectral masks can be used to identify regions of 
tissue having a predetermined condition (e.g., disease state, response to treatment, cell type, etc). 
For example, the invention provides a method of identifying healthy tissue by evaluating a 
metric based at least in part on two ratios of spectral data obtained from a tissue sample. The
invention also provides a method of identifying necrotic tissue by evaluating a metric based on 
fluorescence data obtained from the tissue sample.

[0042] In addition, the invention provides methods of obtaining and arbitrating between
redundant sets of spectral data obtained from the same region of tissue. For example, one 
embodiment comprises obtaining two sets of reflectance spectral data from the same region, 
wherein each set is obtained using light incident to the region at a different angle. In this way, if 
one set of data is affected by an artifact, such as glare, shadow, or other obstruction, the other set
of data provides a back-up that may not be affected by the artifact. The invention comprises 
methods of automatically determining whether one or more data sets is/are affected by an 
artifact, and provides methods of arbitrating between the multiple data sets in order to select a 
representative set of data for the region. The representative set of data may then be processed
according to the various embodiments of the invention, as introduced above.

[0043] Accordingly, the invention comprises obtaining spectral data from one or more regions 
of a tissue sample, arbitrating between redundant data sets obtained from each region, 
automatically masking the data to identify regions that are outside a zone of interest or are 
affected by an obstruction, and processing spectral data from the identified regions in a tissue
classification scheme. Methods of the invention also comprise identifying healthy tissue and
necrotic tissue by evaluating metrics based on spectral data obtained from a tissue sample.

Image Masking

[0044] The invention provides methods for processing tissue-derived optical data for use in a 
classification algorithm. Methods of the invention comprise application of image masks for
identifying ambiguous or unclassifiable optical data. The optical data may comprise, for
example, spectral data and/or acetowhitening kinetic data used in a tissue classification scheme.

[0045] In one aspect, the invention improves the accuracy of tissue classification by properly
identifying and accounting for optical data that are not representative of a zone of interest of a 
tissue sample. Such non-representative data include, for example, data from tissue regions that
are affected by an obstruction and/or regions that lie outside a diagnostic zone of interest.
During examination of tissue, a portion of the tissue may be obstructed, for example, by mucus,
fluid, foam, a medical instrument, glare, shadow, and/or blood. Moreover, tissue examination 
may include data from portions of the tissue sample that lie outside an identified zone of interest. 
Regions that lie outside a zone of interest include, for example, a tissue wall (e.g., a vaginal
wall), an os, an edge surface of a tissue (e.g., a cervical edge), tissue in the vicinity of a smoke
tube, and non-tissue portions of the sample. Once data from the obstructed regions and regions 
outside a zone of interest are identified, they are processed by either elimination (hard masking) 
or by weighting (soft masking) in a tissue classification algorithm.

[0046] Therefore, a preferred method of the invention comprises applying image masks to
automatically identify data from regions of a tissue sample that are obstructed or that lie outside 
a zone of interest. Regions from which such data are obtained are then identified and 
characterized as being indeterminate. Optical data from these regions may be disqualified from
further use in the tissue classification algorithm.

[0047] In some cases, optical data from a region that is only partially obstructed or that lies 
only partially outside a zone of interest are still used in a tissue classification scheme, for 
example, to determine tissue-class probabilities. Those probabilities may be "soft masked" - 
that is, weighted according to a likelihood the region (or a point within the region) is affected by
an obstruction and/or lies outside a zone of interest.

[0048] The invention may also comprise applying image masks to identify regions of a tissue 
sample providing superior tissue classification data. In this case, soft masking of optical data 
from identified regions affords them greater weight in the tissue classification algorithm, 
compared with data from other regions. 

[0049] An image mask as applied in the present invention may comprise a combination of
image processing steps designed to isolate a particular feature of a tissue sample. Exemplary 
image masks presented herein include a blood mask, a mucus mask, a speculum mask, a pooled 
fluid and foam mask, a glare mask, an os mask, a smoke tube mask, a vaginal wall mask, and a 
region-of-interest mask. The area of a tissue sample identified by an image mask is considered
to be "masked." The masked area may be represented as ones or zeros in a binary image, or,
alternatively, the masked areas may simply be represented as a set of points or pixels.

[0050] An image mask of the invention may operate on a complete image of the tissue sample,
or on parts of the image. For example, the invention provides a glare mask which is applied by 
dividing an image into blocks, determining a histogram for one or more of the blocks, and
computing thresholds for each block based on its histogram. This compensates for variations in
overall brightness levels in the image when computing intensity thresholds indicative of glare. 
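
The block-wise thresholding described above might be sketched as follows. This is a minimal illustration, assuming an 8-bit grayscale image stored as a list of rows and assuming that each block's threshold is taken at a fixed percentile of that block's histogram; the percentile rule and the name `block_thresholds` are hypothetical and are not taken from the specification.

```python
def block_thresholds(image, block_size, percentile=0.95):
    """Return {(block_row, block_col): threshold} for a 2-D list of 0-255 intensities.

    Assumption: the threshold for a block is the intensity at a fixed
    percentile of that block's histogram; the actual rule may differ.
    """
    rows, cols = len(image), len(image[0])
    thresholds = {}
    for r0 in range(0, rows, block_size):
        for c0 in range(0, cols, block_size):
            # Build a 256-bin histogram for this block.
            hist = [0] * 256
            for r in range(r0, min(r0 + block_size, rows)):
                for c in range(c0, min(c0 + block_size, cols)):
                    hist[image[r][c]] += 1
            total = sum(hist)
            # Walk the cumulative histogram up to the requested percentile.
            cumulative = 0
            for level, count in enumerate(hist):
                cumulative += count
                if cumulative >= percentile * total:
                    thresholds[(r0 // block_size, c0 // block_size)] = level
                    break
    return thresholds


# Hypothetical 2x4 image: a dark block on the left, a bright block on the right.
sample = [[10, 10, 200, 200],
          [10, 20, 200, 250]]
per_block = block_thresholds(sample, block_size=2)
```

Because each block receives its own threshold, a bright block and a dark block are judged against their local brightness rather than against one global level.
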
[0051] In one embodiment, the invention comprises applying an image mask by determining 
one or more intermediate images before computing a final binary image. For example, the 
invention comprises applying a vaginal wall mask by determining a gradient image of the tissue
sample, determining a skeletonized image from the gradient image, and performing edge linking
and edge extension to obtain a final binary image mask. 

[0052] Image masking techniques of the invention work particularly well when applied in 
tissue classification schemes which use spectral data. For example, tissue classification based on
a principal component analysis method or a feature coordinate extraction method produces more
accurate results when input spectral data are processed via image masking. Accuracy may be 
further increased by employing a tissue classification scheme based on both a principal 
component analysis method and a feature coordinate extraction method. 

[0053] Accordingly, the invention provides methods of performing fast and accurate image
and spectral scans of the tissue, such that both image and spectral data are obtained from each of 
a plurality of regions of the tissue sample. Each data point is keyed to its respective region, and 
the data are used to characterize the condition of each of the regions of interest. In one 
embodiment, spectral and image data are acquired from a tissue sample over an approximately
10 to 15 second interval of time. In other embodiments, the scanning time may be longer or
shorter. 

[0054] Small patient movements, such as those due to breathing, may adversely affect how
certain spectral and image data are keyed to regions of the tissue sample. Thus, the invention
comprises compensating for image misalignment caused by patient movement during data
acquisition. Furthermore, validating misalignment corrections improves the accuracy of
diagnostic procedures that utilize data obtained over an interval of time, particularly where the
misalignments are small and the need for accuracy is great. Methods of the invention may be
performed in real time by determining misalignment corrections, validating them, and adjusting
for them at the same time that optical data are being obtained. 

[0055] Thus, the invention comprises providing image data from an area of a tissue sample,
applying image masks to identify regions of the tissue that are outside a zone of interest or are 
affected by an obstruction, and processing optical data from the identified regions in a tissue 
classification scheme. The step of providing image data may comprise the physical act of 
obtaining a video image of the tissue sample. Alternatively, simply supplying image data
otherwise obtained from the tissue sample may encompass the providing step according to the
invention. 

Displaying Diagnostic Data

[0056] The invention provides methods for displaying diagnostic results obtained from a tissue 
sample. In general, the invention assigns tissue-class probability values to discrete regions of a
patient sample, and creates an overlay for displaying the results. One feature of the overlay is
that it facilitates display of the tissue class probabilities in a way that reflects the diagnostic 
relevance of the data. For example, methods of the invention comprise applying filtering and 
color-blending techniques in order to facilitate display of diagnostic results. Those techniques 
enhance certain portions of the overlay in order to highlight diagnostically-relevant regions of
the sample. 

[0057] Further increases in diagnostic relevance are obtained when the overlay is viewed as a 
composite that includes a reference image of the sample. For example, preferred methods of the
invention represent a range of tissue-class probabilities as a spectral blend between two colors
that contrast with an average tissue color. In one embodiment, a portion of the spectrum 
representing low probability of disease is blended with an average tissue color so that tissue 
regions associated with a low probability of disease are featured less prominently in the 
composite. 
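
The color-blending idea in the preceding paragraph can be illustrated with a short sketch. It assumes 8-bit RGB tuples, a linear blend between the two contrasting colors, and a linear fade toward the average tissue color for probabilities below a cutoff; the specific colors, the cutoff `fade_below`, and the function names are hypothetical.

```python
def lerp(a, b, t):
    """Linearly interpolate between RGB tuples a and b (t in [0, 1])."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def overlay_color(prob, low_color, high_color, tissue_color, fade_below=0.5):
    """Map a tissue-class probability to an overlay color.

    Assumptions: colors are 8-bit RGB tuples, the blend is linear, and
    probabilities below `fade_below` are faded toward the average tissue
    color so that low-probability regions stand out less in the composite.
    """
    color = lerp(low_color, high_color, prob)
    if prob < fade_below:
        # Fade strength goes from 1 (prob = 0) to 0 (prob = fade_below).
        fade = 1.0 - prob / fade_below
        color = lerp(color, tissue_color, fade)
    return color


BLUE, YELLOW = (0, 0, 255), (255, 255, 0)   # hypothetical contrasting colors
AVERAGE_TISSUE = (200, 120, 120)            # hypothetical average tissue color
```

With these assumed values, a probability of 0 maps to the average tissue color itself (so the region nearly disappears in the composite), while a probability of 1 maps to the fully contrasting color.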

[0058] Preferred embodiments of the invention comprise application of diagnostic data that
properly account for indeterminate regions of a tissue sample. A region may be diagnosed as 
indeterminate if it is affected by an obstruction or if it lies outside a zone of diagnostic interest. 
A region of a tissue sample may be obstructed, for example, by mucus, fluid, foam, a medical 
instrument, glare, shadow, and/or blood. Regions that lie outside a zone of interest include, for
example, a tissue wall (e.g., a vaginal wall), an os, an edge surface of a tissue (e.g., a cervical
edge), and tissue in the vicinity of a smoke tube. Data masking algorithms of the invention 
automatically identify data from regions that are obstructed and regions that lie outside a zone of 
interest based on spectral data obtained from those regions. In one embodiment, the overlay 
identifies indeterminate regions without obscuring corresponding portions of the reference
image, when viewed as a composite. Similarly, necrotic regions may be indicated on the
overlay, according to results of necrotic data masking algorithms of the invention. 
[0059] Systems of the invention allow performing fast and accurate image and spectral scans 
of tissue, such that both image and spectral data are obtained from each of a plurality of regions 
of the tissue sample. Each data point is keyed to its respective region, and the data are used to
determine tissue-class probabilities for regions of interest, as well as to identify indeterminate
regions. These systems allow real-time display of diagnostic results during a patient 
examination. For example, data may be obtained from an in vivo tissue sample, and results of a 
tissue classification algorithm may be displayed either during or immediately following the 
examination. This provides a medical professional with nearly instantaneous feedback, which
may be quickly comprehended and used for continued or follow-up examination. In some cases,
the display is prepared within seconds of obtaining data from the tissue. In other cases, the 
display is ready within a matter of minutes or a matter of one or more hours after obtaining data 
from the tissue. 

[0060] Accordingly, the invention comprises providing tissue-class probabilities corresponding
to regions of a tissue sample, creating an overlay that uses color to key the probability values to 
the corresponding regions, and displaying a composite of a reference image of the tissue sample 
with the overlay. Methods of the invention preferably include color-blending and/or filtering
techniques designed to convey diagnostically-relevant data in a manner commensurate with the
relevance of the data. In one embodiment, methods of the invention are performed such that 
diagnostic results are displayed in real-time during a patient examination. The step of providing 
tissue-class probabilities may comprise actual determination of diagnostic results according to 
methods of the invention. Alternatively, simply supplying probability values obtained using any
tissue classification method may encompass the providing step according to the invention.

Calibrating Spectral Data

[0061] The invention provides methods for calibrating spectral data acquisition systems. 
These calibration methods produce spectral data sufficiently accurate for use in tissue 
classification algorithms. More specifically, the invention improves the accuracy of
spectral-based tissue classification schemes, in part, by properly accounting for spatial variations,
instrument-to-instrument variations, and patient-to-patient variations in the acquisition of 
spectral data from tissue samples. 

[0062] The invention provides systems for diagnoses of tissue samples. A single spectral scan 
may consist of the acquisition of spectral data from each of about 500 regions of a tissue sample
with centers spaced about 1.1 mm apart. Other spacings may be used with the same, fewer, or
more regions. Proper calibration is necessary to provide a baseline standard for the test spectral 
data obtained, as well as for the reference spectral data upon which the diagnostic classification 
schemes are based. Differences in spectral data from a tissue sample should be attributable to 
the tissue itself, not to baseline variations. Baseline variations may be caused, for example, by
stray light effects, electronic background effects, variation in light energy delivered to a tissue
sample, spatial heterogeneities of the illumination source, chromatic aberrations of the scanning 
optics, variation in wavelength response of the collection optics, and the efficiency of the 
collection optics. 

[0063] Therefore, a preferred method of the invention comprises obtaining calibration data
from spaced-apart locations on a reference target, wherein the locations are keyed to locations of
a tissue sample from which spectral data are subsequently obtained. The reference target may 
be, for example, a solid target having a known reflectance, a fluorescent dye-filled target, a 
disposable target placed directly onto a tissue sample, or an "open air" target. Moreover, a 
preferred method of the invention comprises processing spectral data from a tissue sample using
calibration data obtained as part of the routine preventive maintenance of an optical instrument, 
as well as using calibration data obtained under test conditions just prior to a patient scan. An 
initial, "factory" calibration of the instrument may be performed in addition to, or in place of, the
preventive maintenance calibration. A calibration baseline may thus be obtained at each region
of a tissue sample using calibration data obtained under tightly controlled preventive 
maintenance and/or factory conditions, as well as using calibration data obtained under
pre-patient conditions, which more closely approximate actual test conditions. The preventive
maintenance and/or factory calibrations account for instrument-to-instrument variability, which
is particularly important in building a reference spectral database containing data obtained from a
number of individual instruments. Thus, methods of the invention comprise adjusting for 
individual instrument response and correcting for temporal, patient-to-patient variability. 
[0064] Before obtaining a spectral scan during a patient exam, a single-use disposable sheath 
may be placed on a part of the optical instrument that comes into contact with patient tissue. The
disposable sheath may affect the spectral data obtained. Thus, one embodiment of the invention
comprises obtaining calibration data in a pre-patient test performed with a disposable sheath in 
place, under patient scan conditions. In order to account for instrument-to-instrument variability, 
the embodiment further comprises obtaining factory and/or preventive maintenance calibration 
data with a disposable sheath in place, as in the pre-patient test. The disposables used in the
factory and/or preventive maintenance tests are of the same type as the disposables used in the
patient scans, but they are typically not the same individual disposables, which should generally 
be maintained under sterile conditions. This difference is accounted for using calibration 
algorithms of the invention. 

[0065] Furthermore, methods of the invention provide procedures for dealing with internal
stray light effects. Internal stray light includes cross-talk between transmitted light and
collection optics of a spectral data acquisition system. Typically, light is produced and 
transmitted internally within an optical instrument. The internal light is generally shielded from 
the collection optics so that light collected from the sample is not thereby contaminated. 
However, there is generally some internal stray light that affects spectral readings from tissue
samples. Preferred methods of the invention comprise correcting for internal stray light by
obtaining spaced-apart data from a "null" target of sufficiently low reflectance so as to yield a 
residual optical signal under patient scan conditions. Preferred methods additionally comprise 
obtaining spaced-apart data from a factory or preventive-maintenance test performed in "open 
air" (with no target) and in the absence of external light (e.g., in a darkroom). Then, spectral
data obtained from a patient scan are calibrated using data from the "null target" and "open air" 
tests. 
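
A heavily simplified sketch of a region-keyed stray light correction is given below. It assumes that a stray-light spectrum has already been estimated for each scan region (for example, from the "null target" measurement) and that the correction is a plain subtraction clipped at zero. How the "null target" and "open air" data are actually combined is not stated at this point in the description, so the function `correct_stray_light` and its inputs are hypothetical.

```python
def correct_stray_light(measured, stray_light):
    """Subtract a per-region stray-light spectrum from a measured spectrum.

    Both arguments map a region index to a list of intensities sampled at
    the same wavelengths. Assumption: the correction is a plain subtraction,
    clipped at zero; the combination of null-target and open-air data into
    `stray_light` is not modeled here.
    """
    corrected = {}
    for region, spectrum in measured.items():
        background = stray_light[region]
        if len(background) != len(spectrum):
            raise ValueError(f"wavelength grids differ for region {region}")
        corrected[region] = [max(s - b, 0.0) for s, b in zip(spectrum, background)]
    return corrected


# Hypothetical two-region, two-wavelength example.
measured = {0: [10.0, 12.0], 1: [5.0, 1.0]}
stray = {0: [1.0, 2.0], 1: [0.5, 2.0]}
result = correct_stray_light(measured, stray)
```

Keying both dictionaries by region mirrors the requirement that calibration data come from spaced-apart locations corresponding to the regions of the tissue scan.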

[0066] Accordingly, the invention comprises obtaining calibration data from a plurality of
spaced-apart locations of a calibration target, obtaining spectral data from a tissue sample at
regions corresponding to the spaced-apart locations, and calibrating the spectral data using the 
calibration data. Methods of the invention also comprise obtaining calibration data from
spaced-apart locations of a calibration target using an optical instrument with a disposable sheath, and
using the data to calibrate subsequently-obtained spectral data from a tissue sample. Moreover,
the invention comprises correcting spectral data for internal stray light effects by performing
"null target" and "open air" tests. 

Evaluating Image Focus

[0067] The invention provides methods of focusing an instrument for the acquisition of optical 
data from a tissue sample. Methods of the invention allow rapid focusing in the context of a
diagnostic procedure in which rapid data acquisition is desirable. For example, inventive
methods allow a user to focus an optical instrument quickly enough to obtain data within an 
optimal window of time following application of an agent to the tissue. 
[0068] The invention comprises projecting light spots onto a tissue sample, superimposing 
focusing elements, and aligning the light spots substantially within the focusing elements. In one
embodiment, a user focuses an optical instrument by aligning laser spots projected onto a tissue
sample within rings that are superimposed at predetermined locations within the user's visual 
field. The user aligns the spots within the rings by manually adjusting the instrument. 
Alternatively, methods provide automatic adjustment of the instrument to properly align the 
spots within the focusing rings. 

[0069] In preferred embodiments, the invention comprises projecting laser beams at fixed
angles with respect to an objective axis. As the distance between the instrument and the tissue 
sample decreases, the laser spots appear to move closer together. Conversely, as the instrument 
moves farther from the tissue sample, the laser spots appear to move farther apart. Focusing 
rings are superimposed within a user's visual field at a position such that when the laser spots lie
within the rings, optimal focus is achieved.

[0070] Methods of the invention are useful for in vitro as well as in vivo applications. The 
color of the light spots is generally chosen to provide adequate contrast between the light spots 
and the tissue sample. In certain medical procedures, it is necessary to apply an agent, such as a 
contrast agent, in order to increase diagnostic clarity. Often such procedures require that data be
obtained within a defined window of opportunity. Methods of the invention are compatible with 
such procedures. For example, systems of the invention allow an optical scan of a tissue sample 
wherein both image and spectral data are obtained from each of a plurality of regions of the
tissue sample. Each data point is keyed to its respective region, and the data may be used to
characterize the condition of the tissue at each region. In one embodiment, spectral and image 
data are acquired from a tissue sample over an approximately 10 to 15 second interval of time. 
In other embodiments, the scanning time may be longer or shorter. Focusing is achieved quickly 
enough so that the tissue scan is completed within an optimal data acquisition window. 

[0071] Obstructions on the surface of the tissue, as well as surface roughness and other
irregularities, may distort the focusing spots projected onto the tissue. As a result, some of the
focusing spots may not be clearly visible or may be distorted such that it is impossible to align 
all the spots within the superimposed focusing elements. Accordingly, preferred methods of the 
invention comprise automatic alignment validation to detect the locations of the light spots and
to determine whether the spots are sufficiently well-aligned. Automatic validation offers an
additional advantage in that it avoids delays caused by discovering improper focus after an 
optical scan has begun. Thus, automatic validation facilitates obtaining optical data within a 
prescribed, optimal window of time following application of contrast agent. 
[0072] Automatic alignment validation may comprise iterative dynamic thresholding to isolate
and determine locations of the projected laser spots. For example, a measure of greenness,
blueness, redness, or other color (or combination of colors) associated with the laser spots may 
be determined from an image of the tissue obtained during or after alignment. In one 
embodiment, the method comprises performing morphological processing between thresholding 
iterations. This provides for the stepwise removal of elements of the image which are not laser
spots.
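
One way to picture thresholding with morphological cleanup between iterations is the sketch below. It is not the procedure of the detailed description: the greenness measure (green minus the mean of red and blue), the rising schedule of threshold fractions, and the crude cleanup step that drops isolated pixels are all assumptions chosen to keep the example short.

```python
def greenness(pixel):
    """Assumed color measure: green minus the mean of red and blue."""
    r, g, b = pixel
    return g - (r + b) / 2


def remove_isolated(mask):
    """Crude morphological cleanup: drop pixels with no 4-connected neighbor."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            out[r][c] = any(
                0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]
                for rr, cc in neighbors
            )
    return out


def isolate_spots(image, fractions=(0.25, 0.5)):
    """Iteratively threshold an RGB image (2-D list of (r, g, b) tuples).

    On each pass the threshold is a fraction of the largest greenness among
    the pixels still retained, so it adapts to the image content. Isolated
    pixels are removed between passes.
    """
    rows, cols = len(image), len(image[0])
    score = [[greenness(p) for p in row] for row in image]
    mask = [[True] * cols for _ in range(rows)]
    for fraction in fractions:
        kept = [score[r][c] for r in range(rows) for c in range(cols) if mask[r][c]]
        if not kept:
            break
        threshold = fraction * max(kept)
        mask = [[mask[r][c] and score[r][c] > threshold for c in range(cols)]
                for r in range(rows)]
        mask = remove_isolated(mask)
    return mask


B, S, D, I = (100, 100, 100), (20, 220, 20), (60, 140, 60), (30, 180, 30)
image = [[S, S, B, B],
         [S, S, B, B],
         [B, B, B, B],
         [D, D, B, I]]   # S: laser spot, D: dim green patch, I: isolated pixel
spots = isolate_spots(image)
```

In this hypothetical image, the first pass removes the background and the cleanup step removes the isolated bright pixel; the second, stricter pass removes the dim patch, leaving only the 2x2 spot.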

[0073] A validation algorithm determines whether a sufficient number of the laser spots are 
adequately aligned within the focusing elements such that an optical scan may begin. Moreover, 
methods of the invention are sufficiently robust such that consistent, validated focus levels are 
achieved over the lifetime of the optical instrument. Optical data obtained using the focusing
methods of the invention may be accumulated over time and used as training data in evolving
statistical classification schemes. 
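
The decision described in paragraph [0073] can be sketched as a simple count of detected spots that fall inside their focusing rings. The ring radius, the required number of aligned spots, and the function name `focus_is_valid` are hypothetical parameters introduced for illustration only.

```python
import math


def focus_is_valid(spots, rings, ring_radius, min_aligned):
    """Return True if at least `min_aligned` spots lie inside their rings.

    `spots` and `rings` are equal-length lists of (x, y) centers in pixels;
    spots[i] is compared with rings[i]. A spot that was not detected is None.
    """
    aligned = 0
    for spot, ring in zip(spots, rings):
        if spot is None:
            continue  # undetected spot (e.g., obscured or distorted)
        if math.dist(spot, ring) <= ring_radius:
            aligned += 1
    return aligned >= min_aligned


# Hypothetical ring centers and detected spot centers (pixels).
rings = [(100, 100), (200, 100), (100, 200), (200, 200)]
spots = [(102, 101), (199, 98), None, (230, 200)]
```

Here two of the four spots are inside their rings, so the scan would be allowed to begin only if the assumed requirement is two aligned spots or fewer.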

[0074] Accordingly, the invention provides focusing methods comprising the steps of 
projecting light spots onto a tissue sample, superimposing focusing elements in a visual field, 
and aligning the light spots substantially within the focusing elements. Preferred embodiments
further comprise the step of validating the alignment of the light spots within the focusing 
elements. 

Visually Enhancing Images

[0075] The invention provides methods of enhancing tissue sample images by filtering
luminance values from an input image and transforming the filtered values to produce an 
enhanced image for use in diagnostic applications. Preferred methods of the invention further 
comprise application of image masks to filter the input image. 

[0076] According to the invention, luminance values corresponding to pixels on a tissue image
are modified in order to improve image quality. Exemplary luminance value modification
algorithms are provided in the detailed description below. The invention is particularly useful 
when an image is too dark or exhibits poor contrast. Image enhancement in those situations 
results in increased diagnostic accuracy by improving the ability to distinguish between 
diagnostic regions of the sample. 

[0077] Methods of the invention provide further improvements in image quality by masking
regions of a sample that are obstructed or are not part of a zone of diagnostic interest. For example,
masking techniques remove or weight data from a portion of a sample that may be obstructed by 
mucus, foam, a medical instrument, glare, shadow, blood, or other barriers. Masking techniques 
also take into account portions of an image that lie outside a zone of diagnostic interest. For
example, regions such as a tissue wall, an os, an edge surface, tissue in the vicinity of a smoke
tube, or non-tissue portions of the sample, are processed as described below in practice of the 
invention. 

[0078] Accordingly, preferred methods of the invention provide image enhancement by 
filtering luminance values of an image that correspond to sample regions that are obstructed or
are otherwise not of diagnostic interest, and applying a mathematical transformation based on
luminance values from regions that are not filtered out. Thus, image correction is substantially 
based on portions of the image that are of the highest diagnostic relevance. 
[0079] Preferred methods of the invention comprise applying image masks to automatically 
identify portions of an image of tissue corresponding to obstructed regions of the tissue and
regions that lie outside a zone of interest. Luminance values from the remaining portions of the
image are used to determine parameters of a transformation algorithm. Finally, an enhanced 
image of the tissue sample is produced by algorithmic transformation of the input luminance 
values. 

[0080] A transformation algorithm of the invention may be a piecewise linear transformation
that serves to enhance image brightness and contrast. In one embodiment, the invention provides 
further contrast enhancement by spatially filtering output of the transformation in order to 
emphasize high frequency components of the image, such as edges and fine features. The 
invention may further comprise color-balancing in order to reduce redness. Further aspects and 
advantages of the invention are provided in the following detailed description thereof.
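
A piecewise linear luminance transformation whose parameters come only from unmasked pixels might look like the sketch below. The three-segment form, the output breakpoints `out_lo` and `out_hi`, and the choice of the minimum and maximum unmasked luminance as input breakpoints are assumptions; the exemplary algorithms of the detailed description may differ.

```python
def piecewise_linear(value, lo, hi, out_lo=16, out_hi=240):
    """Map an 8-bit luminance through three linear segments.

    [0, lo] -> [0, out_lo], [lo, hi] -> [out_lo, out_hi], [hi, 255] -> [out_hi, 255].
    Requires 0 < lo < hi < 255.
    """
    if value <= lo:
        return value * out_lo / lo
    if value <= hi:
        return out_lo + (value - lo) * (out_hi - out_lo) / (hi - lo)
    return out_hi + (value - hi) * (255 - out_hi) / (255 - hi)


def enhance(luminance, mask):
    """Enhance a flat list of luminance values.

    `mask[i]` is True where pixel i is obstructed or outside the zone of
    interest. Assumption: the breakpoints are the min and max luminance of
    the unmasked pixels; the transformation is then applied to every pixel.
    """
    kept = [v for v, m in zip(luminance, mask) if not m]
    lo, hi = min(kept), max(kept)
    if not 0 < lo < hi < 255:
        raise ValueError("unmasked luminance range must satisfy 0 < lo < hi < 255")
    return [round(piecewise_linear(v, lo, hi)) for v in luminance]


# Hypothetical dark, low-contrast pixels plus one masked glare pixel.
enhanced = enhance([40, 60, 80, 250], [False, False, False, True])
```

In this example the unmasked range 40 to 80 is stretched to 16 to 240, while the masked glare pixel does not influence the breakpoints.
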
BRIEF DESCRIPTION OF THE DRAWINGS 

[0081] The objects and features of the invention can be better understood with reference to the 
drawings described below, and the claims. The drawings are not necessarily to scale, emphasis 
instead generally being placed upon illustrating the principles of the invention. In the drawings, 
like numerals are used to indicate like parts throughout the various views. The patent or 
application file contains at least one drawing executed in color. Copies of this patent or patent 
application publication with color drawing(s) will be provided by the U.S. Patent and Trademark 
Office upon request and payment of the necessary fee. 

[0082] While the invention is particularly shown and described herein with reference to 
specific examples and specific embodiments, it should be understood by those skilled in the art 
that various changes in form and detail may be made therein without departing from the spirit 
and scope of the invention. 

[0083] Figure 1 is a block diagram featuring components of a tissue characterization system 
according to an illustrative embodiment of the invention. 

[0084] Figure 2 is a schematic representation of components of the instrument used in the 
tissue characterization system of Figure 1 to obtain spectral data and image data from a tissue 
sample according to an illustrative embodiment of the invention. 

[0085] Figure 3 is a block diagram of the instrument used in the tissue characterization system 
of Figure 1 according to an illustrative embodiment of the invention. 
[0086] Figure 4 depicts a probe within a calibration port according to an illustrative 
embodiment of the invention. 

[0087] Figure 5 depicts an exemplary scan pattern used by the instrument of Figure 1 to obtain 
spatially-correlated spectral data and image data from a tissue sample according to an illustrative 
embodiment of the invention. 

[0088] Figure 6 depicts front views of four exemplary arrangements of illumination sources 
about a probe head according to various illustrative embodiments of the invention. 

[0089] Figure 7 depicts exemplary illumination of a region of a tissue sample using light
incident to the region at two different angles according to an illustrative embodiment of the 
invention. 

[0090] Figure 8 depicts illumination of a cervical tissue sample using a probe and a speculum
according to an illustrative embodiment of the invention.

[0091] Figure 9 is a schematic representation of an accessory device for a probe marked with 
identifying information in the form of a bar code according to an illustrative embodiment of the 
invention. 

[0092] Figure 10 is a block diagram featuring spectral data calibration and correction
components of the tissue characterization system of Figure 1 according to an illustrative
embodiment of the invention. 

[0093] Figure 11 is a block diagram featuring the spectral data pre-processing component of
the tissue characterization system of Figure 1 according to an illustrative embodiment of the
invention.

[0094] Figure 12 shows a graph depicting reflectance spectral intensity as a function of
wavelength using an open air target according to an illustrative embodiment of the invention.

[0095] Figure 13 shows a graph depicting reflectance spectral intensity as a function of
wavelength using a null target according to an illustrative embodiment of the invention.

[0096] Figure 14 shows a graph depicting fluorescence spectral intensity as a function of
wavelength using an open air target according to an illustrative embodiment of the invention.

[0097] Figure 15 shows a graph depicting fluorescence spectral intensity as a function of
wavelength using a null target according to an illustrative embodiment of the invention.

[0098] Figure 16 is a representation of regions of a scan pattern and shows values of
broadband reflectance intensity at each region using an open air target according to an
illustrative embodiment of the invention.

[0099] Figure 17 shows a graph depicting as a function of wavelength the ratio of reflectance
spectral intensity using an open air target to the reflectance spectral intensity using a null target 
according to an illustrative embodiment of the invention. 

[0100] Figure 18 shows a graph depicting as a function of wavelength the ratio of fluorescence
spectral intensity using an open air target to the fluorescence spectral intensity using a null target
according to an illustrative embodiment of the invention. 

[0101] Figure 19 is a photograph of a customized target for factory/preventive maintenance
calibration and for pre-patient calibration of the instrument used in the tissue characterization 
system of Figure 1 according to an illustrative embodiment of the invention. 

[0102] Figure 20 is a representation of the regions of the customized target of Figure 19 that
are used to calibrate broadband reflectance spectral data according to an illustrative embodiment
of the invention. 

[0103] Figure 21 shows a graph depicting as a function of wavelength the mean reflectivity of 
the 10% diffuse target of Figure 19 over the non-masked regions shown in Figure 20, measured 
using the same instrument on two different days according to an illustrative embodiment of the
invention.

[0104] Figure 22A shows a graph depicting, for various individual instruments, curves of 
reflectance intensity (using the BB1 light source), each instrument curve representing a mean of 
reflectance intensity values for regions confirmed as metaplasia by impression and filtered 
according to an illustrative embodiment of the invention. 

[0105] Figure 22B shows a graph depicting, for various individual instruments, curves of
reflectance intensity of the metaplasia-by-impression regions of Figure 22A, after adjustment
according to an illustrative embodiment of the invention. 

[0106] Figure 23 shows a graph depicting the spectral irradiance of a NIST traceable
Quartz-Tungsten-Halogen lamp, along with a model of a blackbody emitter, used for determining an
instrument response correction for fluorescence intensity data according to an illustrative
embodiment of the invention. 

[0107] Figure 24 shows a graph depicting as a function of wavelength the fluorescence 
intensity of a dye solution at each region of a 499-point scan pattern according to an illustrative 
embodiment of the invention. 

[0108] Figure 25 shows a graph depicting as a function of scan position the fluorescence
intensity of a dye solution at a wavelength corresponding to a peak intensity seen in Figure 24 
according to an illustrative embodiment of the invention. 

[0109] Figure 26 shows a graph depicting exemplary mean power spectra for various 
individual instruments subject to a noise performance criterion according to an illustrative
embodiment of the invention.

[0110] Figure 27A is a block diagram featuring steps an operator performs in relation to a
patient scan using the system of Figure 1 according to an illustrative embodiment of the 
invention. 

[0111] Figure 27B is a block diagram featuring steps that the system of Figure 1 performs
during acquisition of spectral data in a patient scan to detect and compensate for movement of 
the sample during the scan. 

[0112] Figure 28 is a block diagram showing the architecture of a video system used in the
system of Figure 1 and how it relates to other components of the system of Figure 1 according to
an illustrative embodiment of the invention. 

[0113] Figure 29A is a single video image of a target of 10% diffuse reflectivity upon which
an arrangement of four laser spots is projected in a target focus validation procedure according to 
an illustrative embodiment of the invention. 

[0114] Figure 29B depicts the focusing image on the target in Figure 29A with superimposed
focus rings viewed by an operator through a viewfinder according to an illustrative embodiment 
of the invention. 

[0115] Figure 30 is a block diagram of a target focus validation procedure according to an
illustrative embodiment of the invention. 

[0116] Figure 31 illustrates some of the steps of the target focus validation procedure of Figure
30 as applied to the target in Figure 29A.

[0117] Figure 32A represents the green channel of an RGB image of a cervical tissue sample, 
used in a target focus validation procedure according to an illustrative embodiment of the 
invention. 

[0118] Figure 32B represents an image of the final verified laser spots on the cervical tissue
sample of Figure 32A, verified during application of the target focus validation procedure of 
Figure 30 according to an illustrative embodiment of the invention. 
[0119] Figure 33 depicts a cervix model onto which laser spots are projected during an 
exemplary application of the target focus validation procedure of Figure 30, where the cervix
model is off-center such that the upper two laser spots fall within the os region of the cervix
model, according to an illustrative embodiment of the invention. 

[0120] Figure 34 shows a graph depicting, as a function of probe position, the mean of a 
measure of focus of each of the four laser spots projected onto the off-center cervix model of 
Figure 33 in the target focus validation procedure of Figure 30, according to an illustrative
embodiment of the invention.

[0121] Figure 35 shows a series of graphs depicting mean reflectance spectra for CIN 2/3 and
non-CIN 2/3 tissues at a time prior to application of acetic acid, at a time corresponding to
maximum whitening, and at a time corresponding to the latest time at which data was obtained,
used in determining an optimal window for obtaining spectral data according to an illustrative
embodiment of the invention.

[0122] Figure 36 shows a graph depicting the reflectance discrimination function spectra
useful for differentiating between CIN 2/3 and non-CIN 2/3 tissues, used in determining an
optimal window for obtaining spectral data according to an illustrative embodiment of the
invention.

[0123] Figure 37 shows a graph depicting the performance of two LDA (linear discriminant
analysis) models as applied to reflectance data obtained at various times following application of
acetic acid, used in determining an optimal window for obtaining spectral data according to an
illustrative embodiment of the invention.

[0124] Figure 38 shows a series of graphs depicting mean fluorescence spectra for CIN 2/3 and
non-CIN 2/3 tissues at a time prior to application of acetic acid, at a time corresponding to
maximum whitening, and at a time corresponding to the latest time at which data was obtained,
used in determining an optimal window for obtaining spectral data according to an illustrative
embodiment of the invention.

[0125] Figure 39 shows a graph depicting the fluorescence discrimination function spectra
useful for differentiating between CIN 2/3 and non-CIN 2/3 tissues in determining an optimal
window for obtaining spectral data according to an illustrative embodiment of the invention.

[0126] Figure 40 shows a graph depicting the performance of two LDA (linear discriminant
analysis) models as applied to fluorescence data obtained at various times following application
of acetic acid, used in determining an optimal window for obtaining spectral data according to an
illustrative embodiment of the invention.

[0127] Figure 41 shows a graph depicting the performance of three LDA models as applied to
data obtained at various times following application of acetic acid, used in determining an
optimal window for obtaining spectral data according to an illustrative embodiment of the
invention.

[0128] Figure 42 shows a graph depicting the determination of an optimal time window for
obtaining diagnostic optical data using an optical amplitude trigger, according to an illustrative
embodiment of the invention.

[0129] Figure 43 shows a graph depicting the determination of an optimal time window for
obtaining diagnostic data using a rate of change of mean reflectance signal trigger, according to
an illustrative embodiment of the invention.

[0130] Figure 44A represents a 480 x 500 pixel image from a sequence of images of in vivo
human cervix tissue and shows a 256 x 256 pixel portion of the image from which data is used in
determining a correction for a misalignment between two images from a sequence of images of
the tissue in the tissue characterization system of Figure 1, according to an illustrative
embodiment of the invention.

[0131] Figure 44B depicts the image represented in Figure 44A and shows a 128 x 128 pixel
portion of the image, made up of 16 individual 32 x 32 pixel validation cells, from which data is
used in performing a validation of the misalignment correction determination according to an
illustrative embodiment of the invention.

[0132] Figure 45 is a schematic flow diagram depicting steps in a method of determining a
correction for image misalignment in the tissue characterization system of Figure 1, according to
an illustrative embodiment of the invention.

[0133] Figures 46A and 46B show a schematic flow diagram depicting steps in a version of the
method shown in Figure 45 of determining a correction for image misalignment according to an
illustrative embodiment of the invention.

[0134] Figures 47A and 47B show a schematic flow diagram depicting steps in a version of the 
method shown in Figure 45 of determining a correction for image misalignment according to an 
illustrative embodiment of the invention. 

[0135] Figures 48A-F depict a subset of adjusted images from a sequence of images of a tissue
with an overlay of gridlines showing the validation cells used in validating the determinations of
misalignment correction between the images according to an illustrative embodiment of the
invention.

[0136] Figure 49A depicts a sample image after application of a 9-pixel size (9 x 9) Laplacian
of Gaussian filter (LoG 9 filter) on an exemplary image from a sequence of images of tissue,
used in determining a correction for image misalignment, according to an illustrative
embodiment of the invention.

[0137] Figure 49B depicts the application of both a feathering technique and a Laplacian of
Gaussian filter on the exemplary image used in Figure 49A to account for border processing
effects, used in determining a correction for image misalignment according to an illustrative
embodiment of the invention.

[0138] Figure 50A depicts a sample image after application of a LoG 9 filter on an exemplary
image from a sequence of images of tissue, used in determining a correction for image
misalignment according to an illustrative embodiment of the invention.

[0139] Figure 50B depicts the application of both a Hamming window technique and a LoG 9
filter on the exemplary image in Figure 50A to account for border processing effects in the
determination of a correction for image misalignment according to an illustrative embodiment of
the invention.

[0140] Figures 51A-F depict the determination of a correction for image misalignment using
methods including the application of LoG filters of various sizes, as well as the application of a
Hamming window technique and a feathering technique according to illustrative embodiments of
the invention.

[0141] Figure 52 shows a graph depicting exemplary mean values of reflectance spectral data
as a function of wavelength for tissue regions affected by glare, tissue regions affected by
shadow, and tissue regions affected by neither glare nor shadow according to an illustrative
embodiment of the invention.

[0142] Figure 53 shows a graph depicting mean values and standard deviations of broadband
reflectance spectral data using the BB1 channel light source for regions confirmed as being
obscured by blood, obscured by mucus, obscured by glare from the BB1 source, obscured by
glare from the BB2 source, or unobscured, according to an illustrative embodiment of the
invention.

[0143] Figure 54 shows a graph depicting mean values and standard deviations of broadband
reflectance spectral data using the BB2 channel light source for regions confirmed as being
obscured by blood, obscured by mucus, obscured by glare from the BB1 source, obscured by
glare from the BB2 source, or unobscured, according to an illustrative embodiment of the
invention.

[0144] Figure 55 shows a graph depicting the weighted difference between the mean
reflectance values of glare-obscured regions and unobscured regions of tissue as a function of
wavelength, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0145] Figure 56 shows a graph depicting the weighted difference between the mean
reflectance values of blood-obscured regions and unobscured regions of tissue as a function of
wavelength, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0146] Figure 57 shows a graph depicting the weighted difference between the mean
reflectance values of mucus-obscured regions and unobscured regions of tissue as a function of
wavelength, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0147] Figure 58 shows a graph depicting a ratio of the weighted differences between the mean
reflectance values of glare-obscured regions and unobscured regions of tissue at two
wavelengths, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0148] Figure 59 shows a graph depicting a ratio of the weighted differences between the mean
reflectance values of blood-obscured regions and unobscured regions of tissue at two
wavelengths, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0149] Figure 60 shows a graph depicting a ratio of the weighted differences between the mean
reflectance values of mucus-obscured regions and unobscured regions of tissue at two
wavelengths, used in determining metrics for application in the arbitration step in Figure 1,
according to an illustrative embodiment of the invention.

[0150] Figure 61 shows a graph depicting as a function of wavelength mean values and
confidence intervals of a ratio of BB1 and BB2 broadband reflectance spectral values for regions
confirmed as being either glare-obscured or shadow-obscured tissue, used in determining metrics
for application in the arbitration step in Figure 1 according to an illustrative embodiment of the
invention.

[0151] Figure 62 shows a graph depicting BB1 and BB2 broadband reflectance spectral data
for a region of tissue where the BB1 data is affected by glare but the BB2 data is not, according
to an illustrative embodiment of the invention.

[0152] Figure 63 shows a graph depicting BB1 and BB2 broadband reflectance spectral data
for a region of tissue where the BB2 data is affected by shadow but the BB1 data is not,
according to an illustrative embodiment of the invention.

[0153] Figure 64 shows a graph depicting BB1 and BB2 broadband reflectance spectral data
for a region of tissue that is obscured by blood, according to an illustrative embodiment of the
invention.

[0154] Figure 65 shows a graph depicting BB1 and BB2 broadband reflectance spectral data
for a region of tissue that is unobscured, according to an illustrative embodiment of the
invention.

[0155] Figure 66 shows a graph depicting the reduction in the variability of broadband
reflectance measurements of CIN 2/3-confirmed tissue produced by applying the metrics in the
arbitration step 128 of Figure 1 to remove data affected by an artifact, according to an illustrative
embodiment of the invention.

[0156] Figure 67 shows a graph depicting the reduction in the variability of broadband
reflectance measurements of tissue classified as "no evidence of disease confirmed by
pathology" produced by applying the metrics in the arbitration step 128 of Figure 1 to remove
data affected by an artifact, according to an illustrative embodiment of the invention.

[0157] Figure 68 shows a graph depicting the reduction in the variability of broadband
reflectance measurements of tissue classified as "metaplasia by impression" produced by
applying the metrics in the arbitration step 128 of Figure 1 to remove data affected by an artifact,
according to an illustrative embodiment of the invention.

[0158] Figure 69 shows a graph depicting the reduction in the variability of broadband
reflectance measurements of tissue classified as "normal by impression" produced by applying
the metrics in the arbitration step 128 of Figure 1 to remove data affected by an artifact,
according to an illustrative embodiment of the invention.

[0159] Figure 70A depicts an exemplary image of cervical tissue divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to an illustrative embodiment of the invention.

[0160] Figure 70B is a representation of the regions depicted in Figure 70A and shows the
categorization of each region using the metrics in the arbitration step 128 of Figure 1, according
to an illustrative embodiment of the invention.

[0161] Figure 71A depicts an exemplary image of cervical tissue divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to an illustrative embodiment of the invention.

[0162] Figure 71B is a representation of the regions depicted in Figure 71A and shows the
categorization of each region using the metrics in the arbitration step 128 of Figure 1, according
to an illustrative embodiment of the invention.

[0163] Figure 72A depicts an exemplary image of cervical tissue divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to an illustrative embodiment of the invention.




[0164] Figure 72B is a representation of the regions depicted in Figure 72A and shows the 
categorization of each region using the metrics in the arbitration step 128 of Figure 1, according 
to an illustrative embodiment of the invention. 

[0165] Figure 73 is a block diagram depicting steps in a method of processing and combining
spectral data and image data obtained in the tissue characterization system of Figure 1 to
determine states of health of regions of a tissue sample, according to an illustrative embodiment
of the invention.

[0166] Figure 74 is a block diagram depicting steps in the method of Figure 73 in further 
detail, according to an illustrative embodiment of the invention. 

[0167] Figure 75 shows a scatter plot depicting discrimination between regions of normal
squamous tissue and CIN 2/3 tissue for known reference data, obtained by comparing
fluorescence intensity at about 460 nm to a ratio of fluorescence intensities at about 505 nm and
about 410 nm, used in determining an NED spectral mask (NEDspec) according to an illustrative
embodiment of the invention.

[0168] Figure 76 shows a graph depicting as a function of wavelength mean broadband
reflectance values for known normal squamous tissue regions and known CIN 2/3 tissue regions,
used in determining an NED spectral mask (NEDspec) according to an illustrative embodiment of
the invention.

[0169] Figure 77 shows a graph depicting as a function of wavelength mean fluorescence
intensity values for known squamous tissue regions and known CIN 2/3 tissue regions, used in
determining an NED spectral mask (NEDspec) according to an illustrative embodiment of the
invention.

[0170] Figure 78 shows a graph depicting values of a discrimination function using a range of
numerator wavelengths and denominator wavelengths in the discrimination analysis between
known normal squamous tissue regions and known CIN 2/3 tissue regions, used in determining
an NED spectral mask (NEDspec) according to an illustrative embodiment of the invention.

[0171] Figure 79A depicts an exemplary reference image of cervical tissue from a patient scan
in which spectral data is used in arbitration, NED spectral masking, and statistical classification
of interrogation points of the tissue sample, according to an illustrative embodiment of the
invention.

[0172] Figure 79B is a representation (obgram) of the interrogation points (regions) of the 
tissue sample depicted in Figure 79A and shows points classified as "filtered" following 
arbitration, "masked" following NED spectral masking with two different sets of parameters, and 
"CIN 2/3" following statistical classification, according to an illustrative embodiment of the
invention. 

[0173] Figure 79C is a representation (obgram) of the interrogation points (regions) of the
tissue sample depicted in Figure 79A and shows points classified as "filtered" following
arbitration, "masked" following NED spectral masking with two different sets of parameters, and
"CIN 2/3" following statistical classification, according to an illustrative embodiment of the
invention.

[0174] Figure 79D is a representation (obgram) of the interrogation points (regions) of the
tissue sample depicted in Figure 79A and shows points classified as "filtered" following
arbitration, "masked" following NED spectral masking with two different sets of parameters, and
"CIN 2/3" following statistical classification, according to an illustrative embodiment of the
invention.

[0175] Figure 80 shows a graph depicting fluorescence intensity as a function of wavelength
from an interrogation point confirmed as invasive carcinoma by pathology and necrotic tissue by
impression, used in determining a Necrosis spectral mask according to an illustrative
embodiment of the invention.

[0176] Figure 81 shows a graph depicting broadband reflectance BB1 and BB2 as functions of
wavelength from an interrogation point confirmed as invasive carcinoma by pathology and
necrotic tissue by impression, used in determining a Necrosis spectral mask according to an
illustrative embodiment of the invention.

[0177] Figure 82A depicts an exemplary reference image of cervical tissue from the scan of a 
patient confirmed as having advanced invasive cancer in which spectral data is used in 
arbitration, Necrosis spectral masking, and statistical classification of interrogation points of the 
tissue sample, according to an illustrative embodiment of the invention. 

[0178] Figure 82B is a representation (obgram) of the interrogation points (regions) of the
tissue sample depicted in Figure 82A and shows points classified as "filtered" following
arbitration, "masked" following application of the "Porphyrin" and "FAD" portions of the
Necrosis spectral mask, and "CIN 2/3" following statistical classification, according to an
illustrative embodiment of the invention.

[0179] Figure 83 shows a graph depicting as a function of wavelength mean broadband
reflectance values for known cervical edge regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a cervical edge/vaginal wall ([CE]spec) spectral mask
according to an illustrative embodiment of the invention.

[0180] Figure 84 shows a graph depicting as a function of wavelength mean fluorescence
intensity values for known cervical edge regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a cervical edge/vaginal wall ([CE]spec) spectral mask
according to an illustrative embodiment of the invention.

[0181] Figure 85 shows a graph depicting as a function of wavelength mean broadband
reflectance values for known vaginal wall regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a cervical edge/vaginal wall ([CE]spec) spectral mask
according to an illustrative embodiment of the invention.

[0182] Figure 86 shows a graph depicting as a function of wavelength mean fluorescence
intensity values for known vaginal wall regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a cervical edge/vaginal wall ([CE]spec) spectral mask
according to an illustrative embodiment of the invention.

[0183] Figure 87A depicts an exemplary reference image of cervical tissue from a patient scan
in which spectral data is used in arbitration and cervical edge/vaginal wall ([CE]spec) spectral
masking, according to an illustrative embodiment of the invention.

[0184] Figure 87B is a representation (obgram) of the interrogation points (regions) of the
tissue sample depicted in Figure 87A and shows points classified as "filtered" following
arbitration and "masked" following cervical edge/vaginal wall ([CE]spec) spectral masking,
according to an illustrative embodiment of the invention.

[0185] Figure 88 shows a graph depicting as a function of wavelength mean broadband
reflectance values for known pooling fluids regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a fluids/mucus ([MU]spec) spectral mask according to an
illustrative embodiment of the invention.

[0186] Figure 89 shows a graph depicting as a function of wavelength mean fluorescence
intensity values for known pooling fluids regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a fluids/mucus ([MU]spec) spectral mask according to an
illustrative embodiment of the invention.

[0187] Figure 90 shows a graph depicting as a function of wavelength mean broadband
reflectance values for known mucus regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a fluids/mucus ([MU]spec) spectral mask according to an
illustrative embodiment of the invention.

[0188] Figure 91 shows a graph depicting as a function of wavelength mean fluorescence
intensity values for known mucus regions and known CIN 2/3 tissue regions, used in a
discrimination analysis to determine a fluids/mucus ([MU]spec) spectral mask according to an
illustrative embodiment of the invention.

[0189] Figure 92A depicts an exemplary reference image of cervical tissue from a patient scan
in which spectral data is used in arbitration and fluids/mucus ([MU]spec) spectral masking,
according to an illustrative embodiment of the invention.

[0190] Figure 92B is a representation (obgram) of the interrogation points (regions) of the
tissue sample depicted in Figure 92A and shows points classified as "filtered" following
arbitration and "masked" following fluids/mucus ([MU]spec) spectral masking, according to an
illustrative embodiment of the invention.

[0191] Figure 93 depicts image masks determined from an image of a tissue sample and shows
how the image masks are combined with respect to each spectral interrogation point (region) of
the tissue sample, according to an illustrative embodiment of the invention.

[0192] Figure 94A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding glare image mask, Glarevid, according to
an illustrative embodiment of the invention.

[0193] Figure 94B represents a glare image mask, Glarevid, corresponding to the exemplary
image in Figure 94A, according to an illustrative embodiment of the invention.

[0194] Figure 95 is a block diagram depicting steps in a method of determining a glare image
mask, Glarevid, for an image of cervical tissue, according to an illustrative embodiment of the
invention.

[0195] Figure 96 shows a detail of a histogram used in a method of determining a glare image
mask, Glarevid, for an image of cervical tissue, according to an illustrative embodiment of the
invention.

[0196] Figure 97A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding region-of-interest image mask, [ROI]vid,
according to an illustrative embodiment of the invention.

[0197] Figure 97B represents a region-of-interest image mask, [ROI]vid, corresponding to the
exemplary image in Figure 97A, according to an illustrative embodiment of the invention.

[0198] Figure 98 is a block diagram depicting steps in a method of determining a
region-of-interest image mask, [ROI]vid, for an image of cervical tissue, according to an
illustrative embodiment of the invention.
[0199] Figure 99A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding smoke tube image mask, [ST]vid,
according to an illustrative embodiment of the invention.

[0200] Figure 99B represents a smoke tube image mask, [ST]vid, corresponding to the
exemplary image in Figure 99A, according to an illustrative embodiment of the invention.

[0201] Figure 100 is a block diagram depicting steps in a method of determining a smoke tube
image mask, [ST]vid, for an image of cervical tissue, according to an illustrative embodiment of
the invention.

[0202] Figure 101A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding os image mask, Osvid, according to an
illustrative embodiment of the invention.

[0203] Figure 101B represents an os image mask, Osvid, corresponding to the exemplary image
in Figure 101A, according to an illustrative embodiment of the invention.

[0204] Figure 102 is a block diagram depicting steps in a method of determining an os image
mask, Osvid, for an image of cervical tissue, according to an illustrative embodiment of the
invention.

[0205] Figure 103A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding blood image mask, Bloodvid, according to
an illustrative embodiment of the invention.

[0206] Figure 103B represents a blood image mask, Bloodvid, corresponding to the exemplary
image in Figure 103A, according to an illustrative embodiment of the invention.

[0207] Figure 104 is a block diagram depicting steps in a method of determining a blood
image mask, Bloodvid, for an image of cervical tissue, according to an illustrative embodiment of
the invention.

[0208] Figure 105A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding mucus image mask, Mucusvid, according to
an illustrative embodiment of the invention.

[0209] Figure 105B represents a mucus image mask, Mucusvid, corresponding to the exemplary
reference image in Figure 105A, according to an illustrative embodiment of the invention.

[0210] Figure 106 is a block diagram depicting steps in a method of determining a mucus
image mask, Mucusvid, for an image of cervical tissue, according to an illustrative embodiment
of the invention.




[0211] Figure 107A depicts an exemplary reference image of cervical tissue obtained during a
patient examination and used in determining a corresponding speculum image mask, [SP]vid,
according to an illustrative embodiment of the invention.

[0212] Figure 107B represents a speculum image mask, [SP]vid, corresponding to the
exemplary image in Figure 107A, according to an illustrative embodiment of the invention.

[0213] Figure 108 is a block diagram depicting steps in a method of determining a speculum
image mask, [SP]vid, for an image of cervical tissue, according to an illustrative embodiment of
the invention.

[0214] Figure 109A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a vaginal wall image mask, [VW]vid, according to an
illustrative embodiment of the invention.

[0215] Figure 109B represents the image of Figure 109A overlaid with a vaginal wall image
mask, [VW]vid, following extension, determined according to an illustrative embodiment of the
invention.

[0216] Figure 110 is a block diagram depicting steps in a method of determining a vaginal wall
image mask, [VW]vid, for an image of cervical tissue, according to an illustrative embodiment of
the invention.

[0217] Figure 111A depicts an exemplary image of cervical tissue obtained during a patient
examination and used in determining a corresponding fluid-and-foam image mask, [FL]vid,
according to an illustrative embodiment of the invention.

[0218] Figure 111B represents a fluid-and-foam image mask, [FL]vid, corresponding to the
exemplary image in Figure 111A, according to an illustrative embodiment of the invention.

[0219] Figure 112 is a block diagram depicting steps in a method of determining a
fluid-and-foam image mask, [FL]vid, for an image of cervical tissue, according to an illustrative
embodiment of the invention.

[0220] Figures 113A-C show graphs representing a step in a method of image visual
enhancement in which a piecewise linear transformation of an input image produces an output
image with enhanced image brightness and contrast, according to one embodiment of the
invention.

[0221] Figure 114A depicts an exemplary image of cervical tissue obtained during a patient
examination and used as a reference (base) image in a method of disease probability display,
according to one embodiment of the invention.

[0222] Figure 114B depicts the output overlay image corresponding to the reference image in
Figure 114A, produced using a method of disease probability display according to one
embodiment of the invention.

[0223] Figure 115A represents a disease display layer produced in a method of disease
probability display for the reference image in Figure 114A, wherein CIN 2/3 probabilities at
interrogation points are represented by circles with intensities scaled by CIN 2/3 probability,
according to one embodiment of the invention.

[0224] Figure 115B represents the disease display layer of Figure 114B following filtering
using a Hamming filter, according to one embodiment of the invention.

[0225] Figure 116 represents the color transformation used to determine the disease display
layer image in a disease probability display method, according to one embodiment of the
invention.

[0226] Figure 117A depicts an exemplary reference image of cervical tissue having necrotic
regions, obtained during a patient examination and used as a reference (base) image in a method
of disease probability display, according to one embodiment of the invention.

[0227] Figure 117B depicts the output overlay image corresponding to the reference image in
Figure 117A, including necrotic regions, indeterminate regions, and CIN 2/3 regions, and
produced using a method of disease probability display according to one embodiment of the
invention.
[0228] The Table of Contents above is provided as a general organizational guide to the
Description of the Illustrative Embodiment. Entries in the Table do not serve to limit support for 
any given element of the invention to a particular section of the Description. 

System 100 overview 

[0229] The invention provides systems and methods for obtaining spectral data and image data 
from a tissue sample, for processing the data, and for using the data to diagnose the tissue 
sample. As used herein, "spectral data" from a tissue sample includes data corresponding to any 
wavelength of the electromagnetic spectrum, not just the visible spectrum. Where exact 
wavelengths are specified, alternate embodiments comprise using wavelengths within a ± 5 nm
range of the given value, within a ± 10 nm range of the given value, and within a ± 25 nm range
of the given value. As used herein, "image data" from a tissue sample includes data from a 
visual representation, such as a photo, a video frame, streaming video, and/or an electronic, 
digital or mathematical analogue of a photo, video frame, or streaming video. As used herein, a 
"tissue sample" may comprise, for example, animal tissue, human tissue, living tissue, and/or 
dead tissue. A tissue sample may be in vivo, in situ, ex vivo, or ex situ, for example. A tissue 
sample may comprise material in the vicinity of tissue, such as non-biological materials
including dressings, chemical agents, and/or medical instruments, for example. 
[0230] Embodiments of the invention include obtaining data from a tissue sample, determining 
which data are of diagnostic value, processing the useful data to obtain a prediction of disease 
state, and displaying the results in a meaningful way. In one embodiment, spectral data and 
image data are obtained from a tissue sample and are used to create a diagnostic map of the 
tissue sample showing regions in which there is a high probability of disease. 
[0231] The systems and methods of the invention can be used to perform an examination of in 
situ tissue without the need for excision or biopsy. In an illustrative embodiment, the systems 
and methods are used to perform in situ examination of the cervical tissue of a patient in a
non-surgical setting, such as in a doctor's office or examination room. The examination may be
preceded or accompanied by a routine pap smear and/or colposcopic examination, and may be
followed up by treatment or biopsy of suspect tissue regions.

[0232] Figure 1 depicts a block diagram featuring components of a tissue characterization 
system 100 according to an illustrative embodiment of the invention. Each component of the 
system 100 is discussed in more detail herein. The system includes components for acquiring 
data, processing data, calculating disease probabilities, and displaying results. 
[0233] In the illustrative system 100 of Figure 1, an instrument 102 obtains spectral data and
image data from a tissue sample. The instrument 102 obtains spectral data from each of a
plurality of regions of the sample during a spectroscopic scan of the tissue 104. During a scan,
video images of the tissue are also obtained by the instrument 102. Illustratively, one or more
complete spectra are obtained for each of 500 discrete regions of a tissue sample
during a scan lasting about 12 seconds. However, in other illustrative embodiments any number
of discrete regions may be scanned and the duration of each scan may vary. Since in-situ tissue
may shift due to involuntary or voluntary patient movement during a scan, video images are used
to detect shifts of the tissue, and to account for the shifts in the diagnostic analysis of the tissue.
Preferably, a detected shift is compensated for in real time 106. For example, as described below
in further detail, one or more components of the instrument 102 may be automatically adjusted
during the examination of a patient while spectral data are obtained in order to compensate for a
detected shift caused by patient movement. Additionally or alternatively, the real-time tracker
106 provides a correction for patient movement that is used to process the spectral data before
calculating disease probabilities. In addition to using image data to track movement, the
illustrative system 100 of Figure 1 uses image data to identify regions that are obstructed or are
outside the areas of interest of a tissue sample 108. This feature of the system 100 of Figure 1 is
discussed herein in more detail.

[0234] The system 100 shown in Figure 1 includes components for performing factory tests
and periodic preventive maintenance procedures 110, the results of which 112 are used to
preprocess patient spectral data 114. In addition, reference spectral calibration data are obtained
116 in an examination setting prior to each patient examination, and the results 118 of the
pre-patient calibration are used along with the factory and preventive maintenance results 112 to
preprocess patient spectral data 114.
[0235] The instrument 102 of Figure 1 includes a frame grabber 120 for obtaining a video
image of the tissue sample. A focusing method 122 is applied and video calibration is performed
124. The corrected video data may then be used to compensate for patient movement during the
spectroscopic data acquisition 104. The corrected video data is also used in image masking 108,
which includes identifying obstructed regions of the tissue sample, as well as regions of tissue
that lie outside an area of diagnostic interest. In one illustrative embodiment, during a patient
scan, a single image is used to compute image masks 108 and to determine a brightness and
contrast correction 126 for displaying diagnostic results. In illustrative alternative embodiments,
more than one image is used to create image masks and/or to determine a visual display
correction.

[0236] In the system of Figure 1, spectral data are acquired 104 within a predetermined period
of time following the application of a contrast agent, such as acetic acid, to the tissue sample.
According to the illustrative embodiment, four raw spectra are obtained for each of
approximately 500 regions of the tissue sample and are processed. A fluorescence spectrum, two
broadband reflectance (backscatter) spectra, and a reference spectrum are obtained at each of the
regions over a range from about 360 nm to about 720 nm wavelength. The period of time within
which a scan is acquired is chosen so that the accuracy of the resulting diagnosis is maximized.
In one illustrative embodiment, a spectral data scan of a cervical tissue sample is performed over
an approximately 12-second period of time within a range between about 30 seconds and about
130 seconds following application of acetic acid to the tissue sample.
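
By way of a non-limiting illustration, the timing constraint described above may be sketched as follows. The function name, the constant names, and the reading that the entire scan of about 12 seconds must begin and end inside the window of about 30 to about 130 seconds are assumptions made for this sketch; the description states only that the scan is performed within that range.

```python
# Hypothetical helper: names and the "whole scan inside the window" rule are
# illustrative assumptions, not taken from the description.
WINDOW_START_S = 30.0   # about 30 s after application of acetic acid
WINDOW_END_S = 130.0    # about 130 s after application of acetic acid
SCAN_DURATION_S = 12.0  # approximate duration of one spectral scan


def scan_fits_window(start_s, duration_s=SCAN_DURATION_S,
                     window=(WINDOW_START_S, WINDOW_END_S)):
    """Return True if a scan starting start_s seconds after application of
    the contrast agent both begins and ends inside the acquisition window."""
    lo, hi = window
    return lo <= start_s and start_s + duration_s <= hi
```

Under these assumptions, a scan started 118 seconds after application still fits the window, while one started at 119 seconds would end after the 130-second limit.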

[0237] The illustrative system 100 includes data processing components for identifying data
that are potentially non-representative of the tissue sample. Preferably, potentially
non-representative data are either hard-masked or soft-masked. Hard-masking of data includes
eliminating the identified, potentially non-representative data from further consideration. This
results in an indeterminate diagnosis in the corresponding region. Hard masks are determined in
components 128, 130, and 108 of the system 100. Soft masking includes applying a weighting
function or weighting factor to the identified, potentially non-representative data. The weighting
is taken into account during calculation of disease probability 132, and may or may not result in
an indeterminate diagnosis in the corresponding region. Soft masks are determined in
component 130 of the system 100.

[0238] Soft masking provides a means of weighting spectral data according to the likelihood
that the data is representative of clear, unobstructed tissue in a region of interest. For example, if
the system 100 determines there is a possibility that one kind of data from a given region is
affected by an obstruction, such as blood or mucus, that data is "penalized" by attributing a
reduced weighting to that data during calculation of disease probability 132. Another kind of
data from the same region that is determined by the system 100 not to be affected by the
obstruction is more heavily weighted in the diagnostic step than the possibly-affected data, since
the unaffected data is attributed a greater weighting in the calculation of disease probability 132.
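
By way of a non-limiting illustration, the distinction between hard masking and soft masking may be sketched as follows. The description does not specify how the weighting enters the calculation of disease probability 132; the weighted average used here, the function name, and the dictionary layout are all assumptions made only to show the idea of penalizing possibly-affected data.

```python
def combine_region_scores(scores, weights, hard_masked=False):
    """Combine per-data-type scores for a single region.

    scores:  dict mapping a data type (e.g. "fluorescence") to a score in [0, 1]
    weights: dict mapping a data type to a soft-mask weight in [0, 1];
             a missing entry is treated as full weight (1.0)
    Returns None to represent an indeterminate result.

    The weighted average is an illustrative assumption, not the method of
    the description.
    """
    if hard_masked:
        # Hard-masked data are eliminated from further consideration.
        return None
    total = sum(weights.get(kind, 1.0) for kind in scores)
    if total == 0:
        # Every data type was fully penalized, so nothing usable remains.
        return None
    weighted = sum(scores[kind] * weights.get(kind, 1.0) for kind in scores)
    return weighted / total
```

In this sketch, a data type with a reduced weight contributes less to the combined value than an unaffected data type from the same region, and a region yields an indeterminate result only when it is hard-masked or when no weight remains.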
[0239] In the illustrative system 100, soft masking is performed in addition to arbitration of
two or more redundant data sets. Arbitration of data sets is performed in component 128. In the
illustrative embodiment, this type of arbitration employs the following steps: obtaining two sets
of broadband reflectance (backscatter) data from each region of the tissue sample using light
incident to the region at two different angles; determining if one of the data sets is affected by an
artifact such as shadow, glare, or obstruction; eliminating one of the redundant reflectance data
sets so affected; and using the other data set in the diagnosis of the tissue at the region. If both of
the data sets are unaffected by an artifact, a mean of the two sets is used.
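
By way of a non-limiting illustration, the arbitration steps above may be sketched as follows. The function name and argument names are hypothetical, the artifact flags are assumed to be supplied by some other step, and returning None when both data sets are affected is an assumption, since the paragraph above does not address that case.

```python
def arbitrate_reflectance(set_1, set_2, set_1_affected, set_2_affected):
    """Arbitrate between two redundant reflectance spectra for one region.

    set_1, set_2: sequences of reflectance values sampled at the same
                  wavelengths, obtained at two different illumination angles
    set_*_affected: True if that data set is judged to be affected by an
                    artifact such as shadow, glare, or obstruction
    """
    if set_1_affected and set_2_affected:
        # Assumption: with no usable data set, report no result.
        return None
    if set_1_affected:
        return list(set_2)
    if set_2_affected:
        return list(set_1)
    # Neither data set is affected: use the mean of the two.
    return [(a + b) / 2.0 for a, b in zip(set_1, set_2)]
```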

[0240] According to the illustrative embodiment, the instrument 102 obtains both video images
and spectral data from a tissue sample. The spectral data may include fluorescence data and
broadband reflectance (backscatter) data. The raw spectral data are processed and then used in a
diagnostic algorithm to determine disease probability for regions of the tissue sample.
According to the illustrative embodiment, both image data and spectral data are used to mask
data that is potentially non-representative of unobstructed regions of interest of the tissue. In
another illustrative embodiment, both the image data and the spectral data are alternatively or
additionally used in the diagnostic algorithm.

[0241] The system 100 also includes a component 132 for determining a disease probability at
each of a plurality of the approximately 500 interrogation points using spectral data processed in
the components 128 and 130 and using the image masks determined in component 108.
Illustratively, the disease probability component 132 processes spectral data with statistical
and/or heuristics-based (non-statistically-derived) spectral classifiers 134, incorporates image
and/or spectral mask information 136, and assigns a probability of high grade disease, such as
CIN 2+, to each examined region of the tissue sample. The classifiers use stored, accumulated
training data from samples of known disease state. The disease display component 138
graphically presents regions of the tissue sample having the highest probability of high grade
disease by employing a color map overlay of the cervical tissue sample. The disease display
component 138 also displays regions of the tissue that are necrotic and/or regions at which a
disease probability could not be determined.

[0242] Each of the components of the illustrative system 100 is described in more detail 
below. 

Instrument - 102 

[0243] Figure 2 is a schematic representation of components of the instrument 102 used in the
tissue characterization system 100 of Figure 1 to obtain spectral data and image data from a
tissue sample according to an illustrative embodiment of the invention. The instrument of Figure
2 includes a console 140 connected to a probe 142 by way of a cable 144. The cable 144 carries
electrical and optical signals between the console 140 and the probe 142. In an alternative
embodiment, signals are transmitted between the console 140 and the probe 142 wirelessly,
obviating the need for the cable 144. The probe 142 accommodates a disposable component 146
that comes into contact with tissue and may be discarded after one use. The console 140 and the
probe 142 are mechanically connected by an articulating arm 148, which can also support the
cable 144. The console 140 contains much of the hardware and the software of the system, and
the probe 142 contains the necessary hardware for making suitable spectroscopic observations.
The details of the instrument 102 are further explained in conjunction with Figure 3.
[0244] Figure 3 shows an exemplary operational block diagram 150 of an instrument 102 of
the type depicted in Figure 2. Referring to Figures 1 and 2, in some illustrative embodiments the
instrument 102 includes features of single-beam spectrometer devices, but is adapted to include
other features of the invention. In other illustrative embodiments, the instrument 102 is
substantially the same as double-beam spectrometer devices, adapted to include other features of
the invention. In still other illustrative embodiments the instrument 102 employs other types of
spectroscopic devices. In the depicted embodiment, the console 140 includes a computer 152,
which executes software that controls the operation of the instrument 102. The software includes
one or more modules recorded on machine-readable media such as magnetic disks, magnetic
tape, CD-ROM, and semiconductor memory, for example. Preferably, the machine-readable
medium is resident within the computer 152. In alternative embodiments, the machine-readable
medium can be connected to the computer 152 by a communication link. However, in
alternative embodiments, one can substitute computer instructions in the form of hardwired logic
for software, or one can substitute firmware (i.e., computer instructions recorded on devices such
as PROMs, EPROMs, EEPROMs, or the like) for software. The term machine-readable
instructions as used herein is intended to encompass software, hardwired logic, firmware, object
code and the like.

[0245] The computer 152 of the instrument 102 is preferably a general purpose computer. The 
computer 152 can be, for example, an embedded computer, a personal computer such as a laptop 
or desktop computer, or another type of computer, that is capable of running the software, 
issuing suitable control commands, and recording information in real-time. The illustrative 
computer 152 includes a display 154 for reporting information to an operator of the instrument 
102, a keyboard 156 for enabling the operator to enter information and commands, and a printer 
158 for providing a print-out, or permanent record, of measurements made by the instrument 102 
and for printing diagnostic results, for example, for inclusion in the chart of a patient. According 
to the illustrative embodiment of the invention, some commands entered at the keyboard 156 
enable a user to perform certain data processing tasks, such as selecting a particular spectrum for
analysis, rejecting a spectrum, and/or selecting particular segments of a spectrum for
normalization. Other commands enable a user to select the wavelength range for each particular
segment and/or to specify both wavelength contiguous and non-contiguous segments. In one
illustrative embodiment, data acquisition and data processing are automated and require little or
no user input after initializing a scan.

[0246] The illustrative console 140 also includes an ultraviolet (UV) source 160 such as a
nitrogen laser or a frequency-tripled Nd:YAG laser, one or more white light sources 162 such as
one, two, three, four, or more Xenon flash lamps, and control electronics 164 for controlling the
light sources both as to intensity and as to the time of onset of operation and the duration of
operation. One or more power supplies 166 are included in the illustrative console 140 to
provide regulated power for the operation of all of the components of the instrument 102. The
illustrative console 140 of Figure 3 also includes at least one spectrometer and at least one
detector (spectrometer and detector 168) suitable for use with each of the light sources. In some
illustrative embodiments, a single spectrometer operates with both the UV light source 160 and
the white light source(s) 162. The same detector may record both UV and white light signals.
However, in other illustrative embodiments, different detectors are used for each light source.
[0247] The illustrative console 140 further includes coupling optics 170 to couple the UV
illumination from the UV light source 160 to one or more optical fibers in the cable 144 for
transmission to the probe 142, and coupling optics 172 for coupling the white light illumination
from the white light source(s) 162 to one or more optical fibers in the cable 144 for transmission
to the probe 142. The spectral response of a specimen to UV illumination from the UV light
source 160 observed by the probe 142 is carried by one or more optical fibers in the cable 144
for transmission to the spectrometer and detector 168 in the console 140. The spectral response
of a specimen to the white light illumination from the white light source(s) 162 observed by the
probe 142 is carried by one or more optical fibers in the cable 144 for transmission to the
spectrometer and detector 168 in the console 140. As shown in Figure 3, the console 140
includes a footswitch 174 to enable an operator of the instrument 102 to signal when it is
appropriate to commence a spectral scan by stepping on the switch. In this manner, the operator
has his or her hands free to perform other tasks, for example, aligning the probe 142.

[0248] The console 140 additionally includes a calibration port 176 into which a calibration
target may be placed for calibrating the optical components of the instrument 102. Illustratively,
an operator places the probe 142 in registry with the calibration port 176 and issues a command
that starts the calibration operation. In an illustrative calibration operation, a calibrated light source
provides a calibration signal in the form of an illumination of known intensity over a range of
wavelengths, and/or at a number of discrete wavelengths. The probe 142 detects the calibration
signal, and transmits the detected signal through the optical fiber in the cable 144 to the
spectrometer and detector 168. A test spectral result is obtained. A calibration of the spectral
system can be computed as the ratio of the amplitude of the known illumination at a particular
wavelength divided by the test spectral result at the same wavelength. Calibration may include
factory calibration 110, preventive maintenance calibration 110, and/or pre-patient calibration
116, as shown in the system 100 of Figure 1. Pre-patient calibration 116 may be performed to
account for patient-to-patient variation, for example.
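
By way of a non-limiting illustration, the ratio described above may be sketched as follows. The function names and the dictionary representation keyed by wavelength are hypothetical, and applying the resulting factors to a raw spectrum by multiplication is an assumption about how such a calibration might be used, not a statement of the method of the description.

```python
def calibration_factors(known_illumination, test_result):
    """Compute a per-wavelength calibration as known amplitude / test result.

    Both arguments are dicts mapping wavelength (nm) to amplitude.
    """
    factors = {}
    for wavelength, known in known_illumination.items():
        measured = test_result[wavelength]
        if measured == 0:
            raise ValueError(f"no detected signal at {wavelength} nm")
        factors[wavelength] = known / measured
    return factors


def apply_calibration(raw_spectrum, factors):
    """Scale a raw spectrum by the calibration factors (an assumed usage)."""
    return {wl: value * factors[wl] for wl, value in raw_spectrum.items()}
```

For example, if the known illumination at a given wavelength has twice the amplitude of the test spectral result at that wavelength, the calibration at that wavelength is 2.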

[0249] Figure 4 depicts the illustrative probe 142 of Figure 2 resting within a calibration port
176 according to an illustrative embodiment of the invention. Referring to Figures 2-4, the
illustrative calibration port 176 is adjustably attached to the probe 142 or the console 140 to
allow an operator to perform pre-patient calibration without assembling detachable parts. The
pre-patient calibration port may contain one or more pre-positioned calibration targets, such as a
customized target 426 (see also Figure 19) and a null target 187, both described in more detail
below.

[0250] According to the illustrative embodiment, factory and/or preventive maintenance
calibration includes using a portable, detachable calibration port to calibrate any number of
individual units, allowing for a standardized calibration procedure among various instruments.
Preferably, the calibration port 176 is designed to prevent stray room light or other external light
from affecting a calibration measurement when a calibration target is in place in the calibration
port 176. For example, as shown in Figure 4, the null target 187 can be positioned up against the
probe head 192 by way of an actuator 189 such that the effect of external stray light is
minimized. When not in use, the null target 187 is positioned out of the path of light between the
customized target 426 and the collection optics 200, as depicted in Figure 4. An additional
fitting may be placed over the probe head 192 to further reduce the effect of external stray light.
According to one illustrative embodiment, the target 187 in the calibration port 176 is located
approximately 100 mm from the probe head 192; and the distance light travels from the target
187 to the first optical component of the probe 142 is approximately 130 mm. The location of
the target (in relation to the probe head 192) during calibration may approximate the location of
tissue during a patient scan.
[0251] The illustrative probe 142 includes probe optics 178 for illuminating a specimen to be
analyzed with UV light from the UV source 160 and for collecting the fluorescent and broadband
reflectance (backscatter) illumination from the specimen being analyzed. The illustrative probe
142 of Figures 2 and 3 includes a scanner assembly 180 that provides illumination from the UV
source 160, for example, in a raster pattern over a target area of the specimen of cervical tissue to
be analyzed. The probe 142 also includes a video camera 182 for observing and recording visual
images of the specimen under analysis. The probe 142 also includes a targeting source 184 for
determining where on the surface of the specimen to be analyzed the probe 142 is pointing. The
probe 142 also includes white light optics 186 to deliver white light from the white light
source(s) 162 for recording the reflectance data and to assist the operator in visualizing the
specimen to be analyzed. Once the operator aligns the instrument 102 and depresses the
footswitch 174, the computer 152 controls the actions of the light sources 160, 162, the coupling
optics 170, 172, the transmission of light signals and electrical signals through the cable 144, the
operation of the probe optics 178 and the scanner assembly 180, the retrieval of observed
spectra, the coupling of the observed spectra into the spectrometer and detector 168 via the cable
144, the operation of the spectrometer and detector 168, and the subsequent signal processing
and analysis of the recorded spectra.

[0252] Figure 4 depicts the probe 142 having top and bottom illumination sources 188, 190
according to an illustrative embodiment of the invention. In this embodiment, the illumination
sources 188, 190 are situated at an upper and a lower location about the perimeter of a probe
head 192 such that there is illuminating light incident to a target area at each of two different
angles. In one embodiment, the target area is a tissue sample. The probe head 192 contains
probe optics 178 for illuminating regions of tissue and for collecting illumination reflected or
otherwise emitted from regions of tissue. Illustratively, the probe optics for collecting the
illumination 200 are located between the top and bottom illumination sources 188, 190. In other
illustrative embodiments, other arrangements of the illuminating and collecting probe optics 178
are used that allow the illumination of a given region of tissue with light incident to the region at
more than one angle. One such arrangement includes the collecting optics 200 positioned around
the illuminating optics.

[0253] In one illustrative embodiment, the top and bottom illumination sources 188, 190 are
alternately turned on and off in order to sequentially illuminate the tissue at equal and opposite
angles relative to the collection axis. For example, the top illumination source 188 is turned on
while the bottom illumination source 190 is turned off, such that spectral measurements may be
obtained for light reflected from a region of the tissue sample 194 illuminated with light incident
to the region at a first angle. This angle is relative to the surface of the tissue sample at a point
on the region, for example. Then, the top illumination source 188 is turned off while the bottom
illumination source 190 is turned on, such that spectral measurements may be obtained using
light incident to the region at a second angle. If data obtained using one of the illumination
sources is adversely affected by an artifact, such as glare or shadow, then data obtained using
another illumination source, with light incident to the region at a different angle, may be
unaffected by the artifact and may still be useful. The spectral measurements can include
reflectance and/or fluorescence data obtained over a range of wavelengths.

[0254] According to the various illustrative embodiments, the top and the bottom illumination
sources 188, 190 may be alternately cycled on and off more than once while obtaining data for a
given region. Also, cycles of the illumination sources 188, 190 may overlap, such that more than
one illumination source is on at one time for at least part of the illumination collection procedure.
Other illumination alternation schemes are possible, depending at least in part on the
arrangement of illumination sources 188, 190 in relation to the probe head 192.

[0255] After data are obtained from one region of the tissue using light incident to the region at
more than one angle, data may likewise be obtained from another region of the tissue. In the
illustrative embodiment of Figure 4, the scanner assembly 180 illuminates a target area of the
tissue sample region-by-region. Illustratively, a first region is illuminated using light incident to
the region at more than one angle as described above, then the probe optics 178 are automatically
adjusted to repeat the illumination sequence at a different region within the target area of the
tissue sample. The illustrative process is repeated until a desired subset of the target area has
been scanned. As mentioned above, preferably about five hundred regions are scanned within a
target area having a diameter of about 25 mm. Using the instrument 102, the scan of the
aforementioned five hundred regions takes about 12 seconds. In other illustrative embodiments,
the number of regions scanned, the size of the target area, and/or the duration of the scan vary
from the above.

[0256] Figure 5 depicts an exemplary scan pattern 202 used by the instrument 102 to obtain
spatially-correlated spectral data and image data from a tissue sample according to an illustrative
embodiment of the invention. Illustratively, spectral data are obtained at 499 regions of the
tissue sample, plus one region out of the field of view of the cervix obtained, for example, for
calibration purposes. The exemplary scan pattern 202 of Figure 5 includes 499 regions 204
whose centers are inside a circle 206 that measures about 25.8 mm in diameter. The center of
each region is about 1.1 mm away from each of the nearest surrounding regions. This may be
achieved by offsetting each scan line by about 0.9527 mm in the y-direction and by staggering
each scan line in the x-direction by about 0.55 mm. Each of the 499 regions is about 0.7 mm in
diameter. In other illustrative embodiments, other geometries are used.
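
By way of a non-limiting illustration, the staggered geometry described above may be sketched as follows. The function and parameter names are hypothetical, and placing one region center at the origin and taking 1.1 mm as the spacing of regions along a scan line are assumptions of this sketch; the ordering of the regions during an actual scan is not modeled.

```python
def scan_centers(pitch_x=1.1, pitch_y=0.9527, stagger=0.55, diameter=25.8):
    """Return (x, y) centers, in mm, of a staggered grid inside a circle.

    pitch_x:  spacing between regions along a scan line (assumed 1.1 mm)
    pitch_y:  offset between successive scan lines in the y-direction
    stagger:  x-direction shift applied to every other scan line
    diameter: diameter of the circle that bounds the region centers
    """
    radius = diameter / 2.0
    n_rows = int(radius // pitch_y)
    n_cols = int(radius // pitch_x) + 1
    centers = []
    for j in range(-n_rows, n_rows + 1):
        y = j * pitch_y
        # Every other scan line is shifted by the stagger distance.
        offset = stagger if j % 2 else 0.0
        for i in range(-n_cols, n_cols + 1):
            x = i * pitch_x + offset
            if x * x + y * y <= radius * radius:
                centers.append((x, y))
    return centers
```

With the default values, the nearest neighbors on adjacent scan lines are separated by the square root of (0.55² + 0.9527²), or about 1.1 mm, and the number of centers returned should fall at or very near the 499 regions described above.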
[0257] According to the illustrative embodiment, the spectral data acquisition component 104
of the system 100 depicted in Figure 1 is performed using the scan pattern 202 shown in Figure
5. A fluorescence spectrum, two broadband reflectance spectra, and a reference spectrum are
obtained at each region 204. The two broadband reflectance spectra use light incident to the
sample at two different angles. A scan preferably begins at the center region 208, which
corresponds to a pixel in a 500 x 480 pixel video image of the tissue sample at location 250, 240.
As discussed in more detail below, a sequence of video images of the tissue sample may be taken
during a scan of the 499 regions shown in Figure 5 and may be used to detect and compensate
for movement of the tissue sample during the scan. The real-time tracker component 106 of the
system 100 shown in Figure 1 performs this motion detection and compensation function.
Preferably, the scanner assembly 180 of Figure 3 includes controls for keeping track of the data
obtained, detecting a stalled scan process, aborting the scan if the tissue is exposed to
temperature or light outside of acceptable ranges, and/or monitoring and reporting errors
detected by the spectral data acquisition component 104 of the system of Figure 1.
[0258] Figure 6 depicts front views of four exemplary arrangements 210, 212, 214, 216 of
illumination sources about a probe head 192 according to various illustrative embodiments of the
invention. The drawings are not to scale; they serve to illustrate exemplary relative
arrangements of illumination sources about the perimeter of a probe head 192. Other
arrangements include positioning collecting optics 200 around the perimeter of the probe head
192, about the illumination sources, or in any other suitable location relative to the illumination
sources. The first arrangement 210 of Figure 6 has one top illumination source 218 and one
bottom illumination source 220, which are alternately cycled on and off as described above. The
illumination sources are arranged about the collecting optics 200, which are located in the center
of the probe head 192. Light from an illumination source is reflected from the tissue and
captured by the collecting optics 200.

[0259] The second arrangement 212 of Figure 6 is similar to the first arrangement 210, except
that there are two illumination sources 222, 224 in the top half of the probe head 192 and two
illumination sources 226, 228 in the bottom half of the probe head 192. In one embodiment, the
two lights above the midline 230 are turned on and the two lights below the midline 230 are
turned off while obtaining a first set of spectral data; then the lights above the midline 230 are
turned off and the lights below the midline 230 are turned on while obtaining a second set of
spectral data. In an alternate illustrative embodiment, only one of the four illumination sources
is turned on at a time to obtain four sets of spectral data for a given region. Other illustrative
embodiments include turning the illumination sources on and off in other patterns. Other
alternative embodiments include using noncircular or otherwise differently shaped illumination
sources, and/or using a different number of illumination sources.
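
By way of a non-limiting illustration, the two switching patterns described for the second arrangement 212 may be sketched as follows. The function name and the representation of each step as a dictionary of on/off states are hypothetical; the keys reuse the reference numerals 222, 224, 226, 228 only as labels, and no hardware control interface is implied.

```python
def illumination_schedule(groups):
    """Yield one on/off state per step, with exactly one group on per step.

    groups: sequence of tuples of source labels; each tuple is switched on
            together while all other sources are switched off.
    """
    all_sources = [source for group in groups for source in group]
    for active in groups:
        yield {source: (source in active) for source in all_sources}


# Top pair, then bottom pair: two sets of spectral data per region.
TOP_THEN_BOTTOM = [(222, 224), (226, 228)]

# One source at a time: four sets of spectral data per region.
ONE_AT_A_TIME = [(222,), (224,), (226,), (228,)]
```

Iterating over `illumination_schedule(TOP_THEN_BOTTOM)` gives two steps, one per set of spectral data, while `ONE_AT_A_TIME` gives four.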

[0260] The third arrangement 214 of Figure 6 includes illumination sources 232, 234, one
positioned on each side of the probe head 192. The sources 232, 234 may be alternated in a
manner analogous to those described for the first arrangement 210.

[0261] The fourth arrangement 216 of Figure 6 is similar to the second arrangement 212,
except that the illumination sources 236, 238 on the right side of the probe head 192 are turned
off and on together, alternately with the illumination sources 240, 242 on the left side of the
probe head 192. Thus, two sets of spectral data may be obtained for a given region, one set
using the illumination sources 236, 238 on the right of the midline 244, and the other set using
the illumination sources 240, 242 on the left of the midline 244.

[0262] Figure 7 depicts exemplary illumination of a region 250 of a tissue sample 194 using
light incident to the region 250 at two different angles 252, 254 according to an illustrative
embodiment of the invention. Figure 7 demonstrates that source light position may affect
whether data is affected by glare. The probe head 192 of Figure 7 is depicted in a cut-away view
for illustrative purposes. In this illustrative embodiment, the top illumination source 188 and
bottom illumination source 190 are turned on sequentially and illuminate the surface of a tissue
sample 194 at equal and opposite angles relative to the collection axis 256. Arrows represent the
light emitted 252 from the top illumination source 188, and the light specularly reflected 258
from the surface of the region 250 of the tissue sample 194. In preferred embodiments, it is
desired to collect diffusely reflected light, as opposed to specularly reflected light 258 (glare).
Since the specularly reflected light 258 from the top illumination source 188 does not enter the
collecting optics 200 in the example illustrated in Figure 7, a set of data obtained using the top
illumination source 188 would not be affected by glare.

[0263] However, in the example illustrated in Figure 7, the emitted light 254 from the bottom
illumination source 190 reaches the surface of the region 250 of the tissue 194 and is specularly
reflected into the collecting optics 200, shown by the arrow 260. Data obtained using the bottom
illumination source 190 in the example pictured in Figure 7 would be affected by glare. This
data may not be useful, for example, in determining a characteristic or a condition of the region
250 of the tissue 194. In this example, it would be advantageous to instead use the set of data
obtained using the top illumination source 188 since it is not affected by glare.
[0264] The position of the collection optics 200 may affect whether or not data is affected by 
glare. For example, light 252 with illumination intensity $I_0(\lambda)$ strikes a tissue surface at a given 
region 250. A fraction of the initial illumination intensity, $\alpha I_0(\lambda)$, is specularly reflected from the 
surface 258, where $\alpha$ is a real number between 0 and 1. An acceptance cone 268 is the space 
through which light is diffusely reflected from the tissue 194 into the collecting optics 200, in 
this embodiment. Light may also be emitted or otherwise transmitted from the surface of the 
tissue. The diffusely reflected light is of interest, since spectral data obtained from diffusely 
reflected light can be used to determine the condition of the region of the sample. If there is no 
specular reflection within the acceptance cone 268, only diffusely reflected light is collected, and 
the collected signal corresponds to $I_t(\lambda)$, where $I_t(\lambda)$ is the intensity of light diffusely reflected 
from the region 250 on the surface of the tissue. 

[0265] If the collection optics 200 are off-center, light incident to the tissue surface may 
specularly reflect within the acceptance cone 268. For example, light with illumination intensity 
$I_0(\lambda)$ strikes the surface of the tissue. Light with a fraction of the initial illumination intensity, 
$\alpha I_0(\lambda)$, from a given source is specularly reflected from the surface 266, where $\alpha$ is a real number 
between 0 and 1. Where there is specular reflection of light within the acceptance cone 268, 
both diffusely reflected light and specularly reflected light reach the collecting optics 200. Thus, 
the collected signal corresponds to an intensity represented by the sum $I_t(\lambda) + \alpha I_0(\lambda)$. It may be 
difficult or impossible to separate the two components of the measured intensity; thus, the data 
may not be helpful in determining the condition of the region of the tissue sample due to the 
glare effect. 
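
The glare model of the two preceding paragraphs can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instrument's code: the function name `collected_signal` and the flag `specular_in_cone` are hypothetical, and each spectrum is a plain list with one value per wavelength sample.

```python
def collected_signal(i_t, i_0, alpha, specular_in_cone):
    """Return the intensity reaching the collecting optics at each wavelength.

    i_t   -- diffusely reflected intensity I_t(lambda), one value per wavelength
    i_0   -- illumination intensity I_0(lambda), same length as i_t
    alpha -- fraction of I_0 that is specularly reflected, 0 <= alpha <= 1
    specular_in_cone -- True if the specular reflection falls inside the
                        acceptance cone (hypothetical flag for this sketch)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if len(i_t) != len(i_0):
        raise ValueError("spectra must have the same length")
    if not specular_in_cone:
        # Only diffusely reflected light is collected: signal is I_t.
        return list(i_t)
    # Glare case: the two components add and cannot be separated afterwards.
    return [t + alpha * s for t, s in zip(i_t, i_0)]
```

The sketch makes the practical point of the text visible: once the specular term is added, the returned list alone no longer reveals the diffuse component.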

[0266] Figure 8 is a diagram 284 depicting illumination of a region 250 of a cervical tissue 
sample 194 using a probe 142 and a vaginal speculum 286 according to an illustrative 
embodiment of the invention. Here, the illuminating light incident to the tissue sample 194 is 
depicted by the upper and lower intersecting cones 196, 198. In a preferred embodiment, the 
probe 142 operates without physically contacting the tissue being analyzed. In one embodiment, 
a disposable sheath 146 is used to cover the probe head 192, for example, in case of incidental 
contact of the probe head 192 with the patient's body. Figure 9 is a schematic representation of 
an accessory device 290 that forms at least part of the disposable sheath 146 for a probe head 
192 according to an illustrative embodiment of the invention. In one illustrative embodiment, 
the entire sheath 146, including the accessory device 290, if present, is disposed of after a single 
use on a patient. As shown in Figure 8, in one illustrative embodiment, the disposable sheath 
146 and/or the accessory device 290 have a unique identifier, such as a two-dimensional bar 
code 292. According to an illustrative feature, the accessory device 290 is configured to provide 
an optimal light path between the optical probe 142 and the target tissue 194. Optional optical 
elements in the accessory device 290 may be used to enhance the light transmitting and light 
receiving functions of the probe 142. 

[0267] Although an illustrative embodiment of the invention is described herein with respect 
to analysis of vaginal tissue, other tissue types may be analyzed using these methods, including, 
for example, colorectal, gastroesophageal, urinary bladder, lung, skin tissue, and/or any tissue 
comprising epithelial cells. 

Spectral calibration - 110, 112, 116 
[0268] Figure 10 is a block diagram 300 featuring components of the tissue characterization 
system 100 of Figure 1 that involve spectral data calibration and correction, according to an 
illustrative embodiment of the invention. The instrument 102 of Figure 1 is calibrated at the 
factory, prior to field use, and may also be calibrated at regular intervals via routine preventive 
maintenance (PM). This is referred to as factory and/or preventive maintenance calibration 110. 
Additionally, calibration is performed immediately prior to each patient scan to account for 
temporal and/or intra-patient sources of variability. This is referred to as pre-patient calibration 
116. The illustrative embodiment includes calibrating one or more elements of the instrument 
102, such as the spectrometer and detector 168 depicted in Figure 3. 

[0269] Calibration includes performing tests to adjust individual instrument response and/or to 
provide corrections accounting for individual instrument variability and/or individual test 
(temporal) variability. During calibration procedures, data is obtained for the pre-processing of 
raw spectral data from a patient scan. The tissue classification system 100 of Figure 1 includes 
determining corrections based on the factory and/or preventive maintenance calibration tests, 
indicated by block 112 in Figure 10 and in Figure 1. Where multiple sets of factory and/or 
preventive maintenance (PM) data exist, the most recent set of data is generally used to 
determine correction factors and to pre-process spectral data from a patient scan. Corrections are 
also determined based on pre-patient calibration tests, indicated by block 118 of Figure 10. The 
correction factors are used, at least indirectly, in the pre-processing (114, Figure 1) of 
fluorescence and reflectance spectral data obtained using a UV light source and two white light 
sources. Block 114 of Figure 11 corresponds to the pre-processing of spectral data in the overall 
tissue classification system 100 of Figure 1, and is further discussed herein. 
[0270] Calibration accounts for sources of individual instrument variability and individual test 
variability in the preprocessing of raw spectral data from a patient scan. Sources of instrument 
and individual test variability include, for example, external light (light originating outside the 
instrument 102, such as room light) and internal stray light. Internal stray light is due at least in 
part to internal "cross talk," or interaction between transmitted light and the collection optics 
200. Calibration also accounts for the electronic background signal read by the instrument 102 
when no light sources, internal or external, are in use. Additionally, calibration accounts for 
variations in the amount of light energy delivered to a tissue sample during a scan, spatial 
inhomogeneities of the illumination source(s), chromatic aberration due to the scanning optics, 
variation in the wavelength response of the collection optics 200, and/or the efficiency of the 
collection optics 200, for example, as well as other effects. 

[0271] In the illustrative embodiment of Figure 10, factory and preventive maintenance 
calibration tests are performed to determine correction factors 112 to apply to raw fluorescence 
and reflectance spectral data obtained during patient scans. The factory/preventive maintenance 
calibration tests 110 include a wavelength calibration test 302, a "null" target test 304, a 
fluorescent dye cuvette test 306, a tungsten source test 308, an "open air" target test 310, a 
customized target test 312, and a NIST standard target test 314. 
[0272] The wavelength calibration test 302 uses mercury and argon spectra to convert a CCD 
pixel index to wavelengths (nm). A wavelength calibration and interpolation method using data 
from the mercury and argon calibration test 302 is described below. 

[0273] The null target test 304 employs a target having about 0% diffuse reflectivity and is 
used along with other test results to account for internal stray light. Data from the factory/PM 
null target test 304 are used to determine the three correction factors shown in block 316 for 
fluorescence spectral measurements (F) obtained using a UV light source, and broadband 
reflectance measurements (BB1, BB2) obtained using each of two white light sources. In one 
embodiment, these three correction factors 316 are used in determining correction factors for 
other tests, including the factory/PM fluorescent dye cuvette test 306, the factory/PM open air 
target test 310, the factory/PM customized target test 312, and the factory/PM NIST standard 
target test 314. The open air target test 310, the customized target test 312, and the NIST 
standard target test 314 are used along with the null target test 304 to correct for internal stray 
light in spectral measurements obtained using a UV light source and one or more white light 
sources. 

[0274] The open air target test 310 is performed without a target and in the absence of external 
light (all room lights turned off). The customized target test 312 employs a custom-designed 
target including a material of approximately 10% diffuse reflectivity and is performed in the 
absence of external light. The custom-designed target also contains phosphorescent and 
fluorescent plugs that are used during instrument focusing and target focus validation 122. In 
one embodiment, the custom-designed target is also used during pre-patient calibration testing 
(116, 330) to monitor the stability of fluorescence readings between preventive maintenance 
procedures and/or to align an ultraviolet (UV) light source 160 - for example, a nitrogen laser or 
a frequency-tripled Nd:YAG laser. The NIST (U.S. National Institute of Standards and 
Technology) standard target test 314 employs a NIST-standard target comprising a material of 
approximately 60% diffuse reflectivity and is performed in the absence of external light. 
Correction factors determined from the "open air" target test 310, the custom target test 312, and 
the NIST-standard target test 314 are shown in blocks 322, 324, and 326 of Figure 10, 
respectively. The correction factors are discussed in more detail below. 

[0275] The fluorescent dye cuvette test 306 accounts for the efficiency of the collection optics 
200 of a given unit. The illustrative embodiment uses data from the fluorescent dye cuvette test 
306 to determine a scalar correction factor 318 for fluorescence measurements (F) obtained using 
a UV light source. The tungsten source test 308 uses a quartz-tungsten-halogen lamp to account 
for the wavelength response of the fluorescence collection optics 200, and data from this test are 
used to determine a correction factor 320 for fluorescence measurements (F) obtained using a 
UV light source. 

[0276] In addition to factory and preventive maintenance calibration 110, pre-patient 
calibration 116 is performed immediately before each patient scan. The pre-patient calibration 
116 includes performing a null target test 328 and a customized target test 330 before each 
patient scan. These tests are similar to the factory/PM null target test 304 and the factory/PM 
custom target test 312, except that they are each performed under exam room conditions 
immediately before a patient scan is conducted. The correction factors shown in blocks 332 and 
334 of Figure 10 are determined from the results of the pre-patient calibration tests. Here, 
correction factors (316, 322) from the factory/PM null target test 304 and the factory/PM open 
air test 310 are used along with pre-patient calibration data to determine the pre-patient 




correction factors 118, which are used, in turn, to pre-process raw spectral data from a patient 
scan, as shown, for example, in Figure 11. 

[0277] Figure 11 is a block diagram 340 featuring the spectral data pre-processing component 
114 of the tissue characterization system 100 of Figure 1 according to an illustrative embodiment 
of the invention. In Figure 11, "F" represents the fluorescence data obtained using the UV light 
source 160, "BB1" represents the broadband reflectance data obtained using the first 188 of the 
two white light sources 162 and "BB2" represents the broadband reflectance data obtained using 
the second 190 of the two white light sources 162. Blocks 342 and 344 indicate steps undertaken 
in pre-processing raw reflectance data obtained from the tissue using each of the two white light 
sources 188, 190, respectively. Block 346 indicates steps undertaken in pre-processing raw 
fluorescence data obtained from the tissue using the UV light source 160. These steps are 
discussed in more detail below. 

[0278] The instrument 102 detailed in Figure 3 features a scanner assembly 180 which 
includes a CCD (charge-coupled device) detector and spectrograph for collecting fluorescence and 
reflectance spectra from tissue samples. Because a CCD detector is used, the system employs a 
calibration procedure to convert a pixel index into wavelength units. Referring to Figure 10, the 
pixel-to-wavelength calibration 302 is performed as part of factory and/or preventive 
maintenance calibration procedures 110. 

[0279] In the illustrative embodiment, the tissue classification system 100 uses spectral data 
obtained at wavelengths within a range from about 360 nm to about 720 nm. Thus, the pixel-to- 
wavelength calibration procedure 302 uses source light that produces peaks near and/or within 
the 360 nm to 720 nm range. A mercury lamp produces distinct, usable peaks between about 
365 nm and about 578 nm, and an argon lamp produces distinct, usable peaks between about 697 
nm and about 740 nm. Thus, the illustrative embodiment uses mercury and argon emission 
spectra to convert a pixel index from a CCD detector into units of wavelength (nm). 

[0280] First, a low-pressure pen-lamp style mercury lamp is used as source light, and intensity 
is plotted as a function of pixel index. The pixel indices of the five largest peaks are correlated 
to ideal, standard Hg peak positions in units of nanometers. Second, a pen-lamp style argon 
lamp is used as source light and intensity is plotted as a function of pixel index. The two largest 
peaks are correlated to ideal, standard Ar peak positions in units of nanometers. 

[0281] The seven total peaks provide a set of representative peaks well-distributed within a 
range from about 365 nm to about 738 nm - comparable to the range from about 360 nm to 
about 720 nm that is used for data analysis in the tissue classification system 100. The 
calibration procedure in block 302 of Figure 10 includes retrieving the following spectra: a 
spectrum using a mercury lamp as light source, a mercury background spectrum (a spectrum 
obtained with the mercury source light turned off), a spectrum using an argon lamp as light 
source, and an argon background spectrum. The respective Hg and Ar background spectra are 
subtracted from the Hg and Ar spectra, producing the background-corrected Hg and Ar spectra. 
The spectra are essentially noise-free and require no smoothing. Each of the seven pixel values 
corresponding to the seven peaks above is determined by finding the centroid of the curve of 
each peak over a +/- 5 pixel range of the maximum as shown in Equation 1: 

$$p_{\text{centroid}} = \frac{\sum_{p = p_{\max}-5}^{p_{\max}+5} p \, I_p}{\sum_{p = p_{\max}-5}^{p_{\max}+5} I_p} \qquad (1)$$

where $p$ is pixel value, $I_p$ is the intensity at pixel $p$, and $p_{\max}$ is the pixel value corresponding to 
each peak maximum. From the $p_{\max}$ determinations, a polynomial function correlating pixel 
value to wavelength value is determined by performing a least-squares fit of the peak data. In 
one embodiment, the polynomial function is of fourth order. In alternative embodiments, the 
polynomial is of first order, second order, third order, fifth order, or higher order. 
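
The two steps described above, the intensity-weighted centroid over a +/- 5 pixel window and the least-squares polynomial fit of peak position to wavelength, can be sketched as follows. This is a minimal illustration in pure Python, not the patented implementation: the function names are hypothetical, and rescaling the pixel axis to [-1, 1] before solving the normal equations is an assumption made here for numerical stability.

```python
def peak_centroid(intensity, p_max, half_width=5):
    """Intensity-weighted centroid over p_max - half_width .. p_max + half_width."""
    lo, hi = p_max - half_width, p_max + half_width
    if lo < 0 or hi >= len(intensity):
        raise ValueError("window runs past the ends of the spectrum")
    total = sum(intensity[lo:hi + 1])
    if total == 0:
        raise ValueError("window contains no signal")
    return sum(p * intensity[p] for p in range(lo, hi + 1)) / total


def fit_pixel_to_wavelength(pixels, wavelengths_nm, order=4):
    """Least-squares polynomial mapping pixel position to wavelength (nm).

    Returns a function predict(pixel) -> wavelength.
    """
    if len(pixels) != len(wavelengths_nm) or len(pixels) <= order:
        raise ValueError("need more peaks than the polynomial order")
    mid = (max(pixels) + min(pixels)) / 2.0
    half = (max(pixels) - min(pixels)) / 2.0
    if half == 0:
        raise ValueError("peak positions must be distinct")
    u = [(p - mid) / half for p in pixels]  # rescaled pixel axis
    n = order + 1
    # Normal equations a * coeffs = b for the monomial basis 1, u, u^2, ...
    a = [[sum(x ** (j + k) for x in u) for k in range(n)] for j in range(n)]
    b = [sum(w * x ** j for x, w in zip(u, wavelengths_nm)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= f * a[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]

    def predict(pixel):
        x = (pixel - mid) / half
        return sum(c * x ** k for k, c in enumerate(coeffs))

    return predict
```

In use, `peak_centroid` would be called once per Hg or Ar peak on the background-corrected spectrum, and the seven resulting positions passed to `fit_pixel_to_wavelength` together with the reference wavelengths.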
[0282] Alternatively to finding $p_{\max}$ by determining the centroid as discussed above, in another 
illustrative embodiment the pixel-to-wavelength calibration procedure 302 includes fitting a 
second order polynomial to the signal intensity versus pixel index data for each of the seven 
peaks around the maximum +/- 3 pixels (range including 7 pixels); taking the derivative of the 
second order polynomial; and finding the zero of the derivative to determine each $p_{\max}$. 
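
A second-order fit has a closed form when the seven pixels are expressed as offsets -3..3 from the maximum, because the odd-power sums vanish. The sketch below assumes that convention; the function name `parabola_peak` is hypothetical. For a fitted curve a + b*x + c*x^2, the derivative b + 2*c*x is zero at x = -b / (2*c), which locates the peak.

```python
def parabola_peak(intensity, p_max, half_width=3):
    """Sub-pixel peak position from a least-squares parabola around p_max."""
    if p_max - half_width < 0 or p_max + half_width >= len(intensity):
        raise ValueError("window runs past the ends of the spectrum")
    offsets = range(-half_width, half_width + 1)
    ys = [intensity[p_max + x] for x in offsets]
    n = len(ys)
    s2 = sum(x * x for x in offsets)      # sum of x^2
    s4 = sum(x ** 4 for x in offsets)     # sum of x^4
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(offsets, ys))
    sx2y = sum(x * x * y for x, y in zip(offsets, ys))
    # Least-squares coefficients of y = a + b*x + c*x^2 on symmetric offsets.
    b = sxy / s2
    c = (n * sx2y - s2 * sy) / (n * s4 - s2 * s2)
    if c >= 0:
        raise ValueError("fitted parabola has no maximum")
    # Zero of the derivative b + 2*c*x, shifted back to pixel units.
    return p_max - b / (2.0 * c)
```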
[0283] The resulting polynomial function correlating pixel value to wavelength value is 
validated, for example, by specifying that the maximum argon peak be located within a given 
pixel range, such as [300:340] and/or that the intensity count at the peak be within a reasonable 
range, such as between 3000 and 32,000 counts. Additionally, the maximum mercury peak is 
validated to be between pixel 150 and 225 and to produce an intensity count between 3000 and 
32,000 counts. Next, the maximum difference between any peak wavelength predicted by the 
polynomial function and its corresponding ideal (reference) peak is required to be within about 
1.0 nm. Alternatively, other validation criteria may be set. 
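
The example validation criteria in this paragraph translate directly into range checks. The following sketch uses the limits quoted in the text as defaults; the function name, argument names, and the choice to return a list of failure messages are assumptions made for illustration.

```python
def validate_calibration(predicted_nm, reference_nm,
                         ar_peak_pixel, ar_peak_counts,
                         hg_peak_pixel, hg_peak_counts,
                         max_error_nm=1.0):
    """Return a list of failed criteria; an empty list means the fit passes."""
    problems = []
    if not 300 <= ar_peak_pixel <= 340:
        problems.append("maximum argon peak outside pixel range [300:340]")
    if not 3000 <= ar_peak_counts <= 32000:
        problems.append("argon peak intensity outside 3000..32000 counts")
    if not 150 <= hg_peak_pixel <= 225:
        problems.append("maximum mercury peak outside pixel range 150..225")
    if not 3000 <= hg_peak_counts <= 32000:
        problems.append("mercury peak intensity outside 3000..32000 counts")
    # Largest disagreement between fitted and reference peak wavelengths.
    worst = max(abs(p - r) for p, r in zip(predicted_nm, reference_nm))
    if worst > max_error_nm:
        problems.append("peak wavelength error exceeds %.1f nm" % max_error_nm)
    return problems
```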

[0284] Additional validation procedures may be performed to compare calibration results 
obtained for different units, as well as stability of calibration results over time. In one illustrative 
embodiment, the pixel-to-wavelength calibration 302 and/or validation is performed as part of 
routine preventive maintenance procedures. 



[0285] Since fluorescence and reflectance spectral data that are used as reference data in the 
classification system 100 may be obtained at multiple clinical sites with different individual 
instruments, the illustrative system 100 standardizes spectral data in step 302 of Figure 10 by 
determining and using values of spectral intensity only at designated values of wavelength. 
Spectral intensity values are standardized by interpolating pixel-based intensities such that they 
correspond to wavelengths that are spaced every 1 nm between about 360 nm and about 720 nm. 
This may be done by linear interpolation of the pixel-based fluorescence and/or reflectance 
values. Other illustrative embodiments use, for example, a cubic spline interpolation procedure 
instead of linear interpolation. 
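
The linear-interpolation variant of this standardization step can be sketched as below. The function name `resample_to_grid` and its defaults are illustrative assumptions; the input wavelengths are taken to be the calibrated wavelength of each pixel, strictly increasing and covering the 360 nm to 720 nm grid.

```python
from bisect import bisect_right


def resample_to_grid(wavelengths_nm, intensities,
                     start_nm=360, stop_nm=720, step_nm=1):
    """Linearly interpolate pixel-based intensities onto a regular nm grid."""
    if len(wavelengths_nm) != len(intensities) or len(wavelengths_nm) < 2:
        raise ValueError("need matching wavelength and intensity lists")
    if wavelengths_nm[0] > start_nm or wavelengths_nm[-1] < stop_nm:
        raise ValueError("pixel wavelengths do not cover the requested grid")
    grid = list(range(start_nm, stop_nm + 1, step_nm))
    resampled = []
    for w in grid:
        # Index of the first pixel wavelength strictly greater than w,
        # clamped so that j - 1 and j always bracket w.
        j = bisect_right(wavelengths_nm, w)
        j = min(max(j, 1), len(wavelengths_nm) - 1)
        w0, w1 = wavelengths_nm[j - 1], wavelengths_nm[j]
        y0, y1 = intensities[j - 1], intensities[j]
        resampled.append(y0 + (y1 - y0) * (w - w0) / (w1 - w0))
    return grid, resampled
```

A cubic spline, as mentioned for other embodiments, would replace only the inner interpolation formula; the grid construction stays the same.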

[0286] In some illustrative embodiments, spectral data acquisition during patient scans and 
during the calibration procedures of Figure 10 includes the use of a CCD array as part of the 
scanner assembly 180 depicted in Figure 3. The CCD array may contain any number of pixels 
corresponding to data obtained at a given time and at a given interrogation point. In one 
embodiment, the CCD array contains about 532 pixels, including unused leading pixels from 
index 0 to 9, relevant data from index 10 to 400, a power monitor region from index 401 to 521, 
and unused trailing pixels from index 522 to 531. One embodiment includes "power correcting" 
or "power monitor correcting" by scaling raw reflectance and/or fluorescence intensity 
measurements received from a region of a tissue sample with a measure of the intensity of light 
transmitted to the region of the tissue sample. In order to provide the scaling factor, the 
instrument 102 directs a portion of a light beam onto the CCD array, for example, at pixel 
indices 401 to 521, and integrates intensity readings over this portion of the array. 
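
A power-monitor correction along these lines might look like the sketch below. The pixel ranges are the ones quoted in the text for the 532-pixel embodiment; dividing the data pixels by the summed power-monitor reading is an assumption about how the scaling is applied, and the function name is hypothetical.

```python
def power_monitor_correct(ccd_row, data=(10, 400), monitor=(401, 521)):
    """Scale the data pixels of one CCD readout by the integrated monitor signal.

    ccd_row -- intensity readings for one interrogation point, indexed by pixel
    data    -- inclusive (first, last) pixel indices of the relevant data
    monitor -- inclusive (first, last) pixel indices of the power monitor region
    """
    if len(ccd_row) <= monitor[1]:
        raise ValueError("readout is shorter than the power monitor region")
    # Integrate the readings over the power monitor portion of the array.
    power = sum(ccd_row[monitor[0]:monitor[1] + 1])
    if power <= 0:
        raise ValueError("power monitor reading must be positive")
    return [value / power for value in ccd_row[data[0]:data[1] + 1]]
```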
[0287] In one preferred embodiment, both factory/PM 110 and pre-patient 116 calibration 
account for chromatic, spatial, and temporal variability caused by system interference due to 
external stray light, internal stray light, and electronic background signals. External stray light 
originates from sources external to the instrument 102, for example, examination room lights 
and/or a colposcope light. The occurrence and intensity of the effect of external stray light on 
spectral data are variable and depend on patient parameters and the operator's use of the 
instrument 102. For example, as shown in Figure 8, the farther the probe head 192 rests from the 
speculum 286 in the examination of cervical tissue, the greater the opportunity for room light to 
be present on the cervix. The configuration and location of a disposable component 146 on the 
probe head 192 also affect external stray light that reaches a tissue sample. Additionally, if the 
operator forgets to turn off the colposcope light before taking a spectral scan, there is a chance 
that light will be incident on the cervix and affect spectral data obtained. 
[0288] Electronic background signals are signals read from the CCD array when no light 
sources, internal or external, are in use. According to the illustrative embodiment, for all 
components of the tissue characterization system 100 that involve obtaining and/or using spectral 
data, including components 110, 116, 104, and 114 of Figure 1, both external stray light and 
electronic background signals are taken into account by means of a background reading. For 
each interrogation point in a spectral scan in which one or more internal light sources are used, a 
background reading is obtained in which all internal light sources (for example, the Xenon lamps 
and the UV laser) are turned off. According to one feature, the background reading immediately 
precedes the fluorescence and broadband reflectance measurements at each scan location, and 
the system 100 corrects for external stray light and electronic background by subtracting the 
background reading from the corresponding spectral reading at a given interrogation point. In 
Figure 10, each calibration test - including 304, 306, 308, 310, 312, 314, 328, and 330 - includes 
obtaining a background reading at each interrogation point and subtracting it from the test 
reading to account for external stray light and electronic background signals. Also, background 
subtraction is a step in the spectral data preprocessing 114 methods in Figure 11, for the 
pre-processing of raw BB1 and BB2 reflectance data 342, 344 as well as the pre-processing of raw 
fluorescence data 346. 
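
The background subtraction described here is an element-by-element difference of two arrays indexed by interrogation point and wavelength. A minimal sketch, assuming each array is a list of per-point spectra (the function name is hypothetical):

```python
def subtract_background(measured, background):
    """Subtract the background reading from the spectral reading at each point.

    measured   -- measured[i][k]: reading at interrogation point i, wavelength k,
                  taken with an internal light source on
    background -- background[i][k]: reading at the same point and wavelength
                  with all internal light sources off
    """
    if len(measured) != len(background):
        raise ValueError("arrays must cover the same interrogation points")
    corrected = []
    for row_m, row_b in zip(measured, background):
        if len(row_m) != len(row_b):
            raise ValueError("spectra must have the same number of wavelengths")
        corrected.append([m - b for m, b in zip(row_m, row_b)])
    return corrected
```

The result is corrected for electronic background and external stray light only; any internal stray light is still present in the returned array.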

[0289] Equation 2 shows the background correction for a generic spectral measurement from a 
tissue sample, $S_{\text{tissue+ISL+ESL+EB}}(i,\lambda)$: 

$$S_{\text{tissue+ISL}}(i,\lambda) = S_{\text{tissue+ISL+ESL+EB}}(i,\lambda) - Bk_{\text{EB+ESL}}(i,\lambda) \qquad (2)$$

where $i$ corresponds to a scan location; $\lambda$ is wavelength or its pixel index equivalent; and 
subscripts denote influences on the spectral measurement - where "tissue" represents the tissue 
sample, "ISL" represents internal stray light (internal to the instrument 102), "ESL" represents 
external stray light, and "EB" represents electronic background. $S_{\text{tissue+ISL+ESL+EB}}(i,\lambda)$ is a 
two-dimensional array (which may be power-monitor corrected) of spectral data obtained from the 
tissue at each interrogation point (region) $i$ as a function of wavelength $\lambda$; and $Bk_{\text{EB+ESL}}(i,\lambda)$ is a 
two-dimensional array representing values of the corresponding background spectral readings at 
each point $i$ as a function of wavelength $\lambda$. $S_{\text{tissue+ISL}}(i,\lambda)$ is the background-subtracted spectral 
array that is thereby corrected for effects of electronic background (EB) and external stray light 
(ESL) on the spectral data from the tissue sample. The electronic background reading is 
subtracted on a wavelength-by-wavelength, location-by-location basis. Subtracting the 
background reading generally does not correct for internal stray light (ISL), as denoted in the 
subscript of $S_{\text{tissue+ISL}}(i,\lambda)$. 
[0290] Internal stray light includes internal cross talk and interaction between the transmitted 
light within the system and the collection optics. For fluorescence measurements, a primary 
source of internal stray light is low-level fluorescence of optics internal to the probe 142 and the 
disposable component 146. For reflectance measurements, a primary source of internal stray 
light is light reflected off of the disposable 146 and surfaces in the probe 142 that is collected 
through the collection optics 200. The positioning of the disposable 146 can contribute to the 
effect of internal stray light on reflectance measurements. For example, the internal stray light 
effect may vary over interrogation points of a tissue sample scan in a non-random, identifiable 
pattern due to the position of the disposable during the test. 

[0291] According to the illustrative embodiment of Figure 10, the factory/PM null target test 
304, the factory/PM open air target test 310, the factory/PM custom target test 312, the 
factory/PM NIST target test 314, the pre-patient null target test 328, and the pre-patient custom 
target test 330 provide correction factors to account for internal stray light effects on 
fluorescence and reflectance spectral measurements. In an alternative illustrative embodiment, a 
subset of these tests is used to account for internal stray light effects. 

[0292] The null target test 304, 328, performed in factory/preventive maintenance 110 and 
pre-patient 116 calibration procedures, uses a target that has a theoretical diffuse reflectance of 
0%, although the actual value may be higher. Since, at least theoretically, no light is reflected by 
the target, the contribution of internal stray light can be measured for a given internal light 
source by obtaining a spectrum from a region or series of regions of the null target with the 
internal light source turned on, obtaining a background spectrum from the null target with the 
internal light source turned off, and background-subtracting to remove any effect of electronic 
background signal or external stray light. The background-subtracted reading is then a measure 
of internal stray light. The pre-patient null target test 328 takes into account spatially-dependent 
internal stray light artifacts induced by the position of a disposable 146, as well as temporal 
variability induced, for example, by the aging of the instrument and/or dust accumulation. In 
one embodiment, the factory/PM null target test 304 is used in calculating correction factors 
from other factory and/or preventive maintenance calibration procedures. The null target tests 
304, 328 are not perfect, and improved measurements of the effect of internal stray light on 
spectral data can be achieved by performing additional tests. 

[0293] The open air target test 310 is part of the factory/preventive maintenance (PM) 
calibration procedure 110 of Figure 10 and provides a complement to the null target tests 304, 
328. The open air target test 310 obtains data in the absence of a target with the internal light 
sources turned on and all light sources external to the device turned off, for example, in a 
darkroom. The null target test 304, by contrast, does not have to be performed in a darkroom 
since it uses a target in place in the calibration port, thereby sealing the instrument such that 
measurements of light from the target are not affected by external light. Although a disposable 
146 is in place during open air test measurements, the factory/PM open air target test 310 does 
not account for any differences due to different disposables used in each patient run. The open 
air measurements are important in some embodiments, however, since they are performed under 
more controlled conditions than pre-patient calibration tests 116; for example, the open air tests 
may be performed in a darkroom. Also, the factory/PM calibration 110 measurements account 
for differences between individual instruments 102, as well as the effects of machine aging - 
both important factors since reference data obtained by any number of individual instruments 
102 are standardized for use in a tissue classification algorithm, such as the one depicted in block 
132 of Figure 1. 

[0294] Figures 12, 13, 14, and 15 show graphs demonstrating mean background-subtracted, 
power-monitor-corrected intensity readings from a factory open air target test 310 and a null 
target test 304 using a BB1 reflectance white light source and a UV light source (laser). Figure 
12 shows a graph 364 of mean intensity 366 from an open air target test over a set of regions as a 
function of wavelength 368 using a BB1 reflectance white light source 188, the "top" source 
188 as depicted in Figures 4, 7, and 8. Figure 13 shows a graph 372 of mean intensity 366 from 
a null target test over the set of regions as a function of wavelength 368 using the same BB1 light 
source. Curves 370 and 374 are comparable but there are some differences. 
[0295] Figure 14 shows a graph 376 of mean intensity 378 from an open air target test over a 
set of regions as a function of wavelength 380 using a UV light source, while Figure 15 shows a 
graph 384 of mean intensity 378 from a null target test over the set of regions as a function of 
wavelength 380 using the UV light source. Again, curves 382 and 386 are comparable, but there 
are some differences between them. Differences between the open air test intensity and null 
target test intensity are generally less than 0.1% for reflectance data and under 1 count/μJ for 
fluorescence data. 

[0296] Accounting for internal stray light is more complicated for reflectance measurements 
than for fluorescence measurements due to an increased spatial dependence. The open air target 
test measurement, in particular, has a spatial profile that is dependent on the position of the 
disposable. 
[0297] Figure 16 shows a representation 390 of regions of an exemplary scan performed in a 
factory open air target test. The representation 390 shows that broadband intensity readings can 
vary in a non-random, spatially-dependent manner. Other exemplary scans performed in factory 
open air target tests show a more randomized, less spatially-dependent variation of intensity 
readings than the scan shown in Figure 16. 

[0298] According to the illustrative embodiment, the system 100 of Figure 1 accounts for 
internal stray light by using a combination of the results of one or more open air target tests 310 
with one or more null target tests 304, 328. In an alternative embodiment, open air target test 
data is not used at all to correct for internal stray light, pre-patient null target test data being used 
instead. 

[0299] Where open air and null target test results are combined, it is helpful to avoid 
compounding noise effects from the tests. Figure 17 shows a graph 402 depicting as a function 
of wavelength 406 the ratio 404 of the background-corrected, power-monitor-corrected 
reflectance spectral intensity at a given region using an open air target to the reflectance spectral 
intensity at the region using a null target according to an illustrative embodiment of the 
invention. The raw data 407 is shown in Figure 17 fit with a second-order polynomial 412, and 
fit with a third-order polynomial without filtering 410, and with filtering 408. As seen by the 
differences between curve 407 and curves 408, 410, and 412, where a ratio of open air target 
data and null target data is used to correct for internal stray light in reflectance measurements, a 
curve fit of the raw data reduces the effect of noise. This is shown in more detail herein with 
respect to the calculation of pre-patient corrections 118 in Figure 10. Also evident in Figure 17 
is that the open air measurement generally differs from the null target measurement, since the 
ratio 404 is not equal to 1, and since the ratio 404 has a distinct wavelength dependence. 
[0300] Figure 18 shows a graph 414 depicting as a function of wavelength 418 the ratio 416 of
fluorescence spectral intensity using an open air target to the fluorescence spectral intensity
using a null target according to an illustrative embodiment of the invention. The raw data 420
does not display a clear wavelength dependence, except that noise increases at higher
wavelengths. A mean 422 based on the ratio data 420 over a range of wavelengths is plotted in
Figure 18. Where a ratio of open air target to null target data is used to correct for internal stray
light in fluorescence measurements, using a mean value calculated from raw data over a stable
range of wavelength reduces noise and does not ignore any clear wavelength dependence.
[0301] Figure 10 shows correction factors corresponding to open air 310 and null target 304,
328 calibration tests in one embodiment that compensates spectral measurements for internal
stray light effects. There are three types of spectral measurements in Figure 10: fluorescence
(F) measurements and two reflectance measurements (BB1, BB2), corresponding to data obtained
using a UV light source and two different white light sources, respectively. The corrections in
blocks 316, 322, and 332 come from the results of the factory/PM null target test 304, the
factory/PM open air target test 310, and the pre-patient null target test 328, respectively, and
these correction factors are applied in spectral data pre-processing (Figure 11) to compensate for
the effects of internal stray light. These correction factors are described below in terms of this
embodiment.

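[0302] Block 316 in Figure 10 contains correction factors computed from the results of the null
target test 304, performed during factory and/or preventive maintenance (PM) calibration. The
null target test includes obtaining a one-dimensional array of mean values of spectral data from
each channel (F, BB1, and BB2) corresponding to the three different light sources, as shown
in Equations 3, 4, and 5:

$$\mathrm{FCNULLFL} = \langle I_{nt,F}(i,\lambda,t_0) \rangle_i \qquad (3)$$

$$\mathrm{FCNULLBB1} = \langle I_{nt,BB1}(i,\lambda,t_0) \rangle_i \qquad (4)$$

$$\mathrm{FCNULLBB2} = \langle I_{nt,BB2}(i,\lambda,t_0) \rangle_i \qquad (5)$$
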
where $I_{nt}$ refers to a background-subtracted, power-monitor-corrected two-dimensional array of
spectral intensity values; subscript F refers to intensity data obtained using the fluorescence UV
light source; subscripts BB1 and BB2 refer to intensity data obtained using the reflectance BB1
and BB2 white light sources, respectively; $i$ refers to interrogation point "i" on the calibration
target; $\lambda$ refers to the wavelength to which an intensity measurement corresponds, or its
approximate pixel index equivalent; $t_0$ refers to the fact that the measurement is obtained from a
factory or preventive maintenance test, the "time" the measurement is made; and
$\langle\ \rangle_i$ represents a one-dimensional array (spectrum) of mean values computed on a
pixel-by-pixel basis for each interrogation point, i. In this embodiment, a one-dimensional array
(spectrum) of fluorescence values corresponding to wavelengths from $\lambda$ = 370 nm to
$\lambda$ = 720 nm is obtained at each of 499 interrogation points, i. An exemplary scan pattern 202
of 499 interrogation points appears in Figure 5. In the illustrative embodiment, data from an
additional interrogation point is obtained from a region outside the target 206. Each of the
reflectance intensity spectra is obtained over the same wavelength range as the fluorescence
intensity spectra, but the BB1 data is obtained at each of 250 interrogation points over the bottom
half of the target and the BB2 data is obtained at each of 249 interrogation points over the top
half of the target. This avoids a shadowing effect due to the angle at which the light from each
source strikes the target during the null target test 304. Values of the most recent factory or
preventive maintenance calibration test, including the factory/PM null target test 304, are used
in spectral data pre-processing (Figure 11) for each patient scan.

[0303] The pre-patient null target test, shown in block 328 of Figure 10, is similar to the
factory/PM null target test 304, except that it is performed just prior to each patient test scan.
Each pre-patient null target test 328 produces three arrays of spectral data as shown below:

$$I_{nt,F}(i,\lambda,t') \qquad (6)$$

$$I_{nt,BB1}(i,\lambda,t') \qquad (7)$$

$$I_{nt,BB2}(i,\lambda,t') \qquad (8)$$

where $t'$ refers to the fact that the measurements are obtained just prior to the test patient scan,
as opposed to during factory/PM testing ($t_0$).

[0304] Block 322 in Figure 10 contains correction factors from the open air target test 310,
performed during factory and/or preventive maintenance (PM) calibration 110. The open air
target test is performed with the disposable in place, in the absence of a target, with the internal
light sources turned on, and with all light sources external to the device turned off. The open air
target test 310 includes obtaining an array of spectral data values from each of the three channels
(F, BB1, and BB2) as shown below:

$$I_{oa,F}(i,\lambda,t_0) \qquad (9)$$

$$I_{oa,BB1}(i,\lambda,t_0) \qquad (10)$$

$$I_{oa,BB2}(i,\lambda,t_0) \qquad (11)$$



[0305] In each of items 9, 10, and 11 above, $I_{oa}$ refers to a background-subtracted,
power-monitor-corrected array of spectral intensity values; $i$ runs from interrogation points 1 to
499; and $\lambda$ runs from 370 nm to 720 nm (or the approximate pixel index equivalent).
[0306] According to the illustrative embodiment, correction for internal stray light makes use
of both null target test results and open air target test results. Correction factors in block 322 of
Figure 10 use results from the factory/PM null target test 304 and factory/PM open air target test
310. The correction factors in block 322 are computed as follows:

$$\mathrm{sFCOFL} = \left[ \langle I_{oa,F}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,F}(i,\lambda,t_0) \rangle_i \right]_{\text{mean},\ \lambda\,=\,375\ \text{nm to}\ 470\ \text{nm}} \qquad (12)$$

$$\mathrm{FCOBB1} = \text{fitted form of } \langle I_{oa,BB1}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,BB1}(i,\lambda,t_0) \rangle_i \qquad (13)$$

$$\mathrm{FCOBB2} = \text{fitted form of } \langle I_{oa,BB2}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,BB2}(i,\lambda,t_0) \rangle_i \qquad (14)$$



where $\langle\ \rangle_i$ represents a spectrum (1-dimensional array) of mean values computed on a
pixel-by-pixel basis for each interrogation point i, and where
$\langle\ \rangle_i / \langle\ \rangle_i$ represents a spectrum (1-dimensional array) of quotients
(ratios of means) computed on a pixel-by-pixel basis for each interrogation point i. The correction
factor sFCOFL in Equation 12 is a scalar quantity representing the mean value of the
1-dimensional array in brackets [ ] across pixel indices corresponding to the wavelength range of
about 375 nm to about 470 nm.
[0307] Figure 18 shows an example value of sFCOFL 422 evaluated using a set of mean open
air spectral data and mean null target spectral data. Large oscillations are damped by using the
mean in Equation 12. Other wavelength ranges can be chosen instead of the wavelength range of
about 375 nm to about 470 nm.
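
By way of non-limiting illustration, the averaging of Equation 12 may be sketched in Python as follows. The function name `scalar_fluorescence_ratio`, its arguments, and the sample values are hypothetical and do not appear in the described embodiment; the sketch assumes that the mean open air and mean null target spectra have already been computed pixel by pixel.

```python
def scalar_fluorescence_ratio(wavelengths_nm, open_air_mean, null_mean,
                              lo_nm=375.0, hi_nm=470.0):
    """Mean of the open air / null target ratio over [lo_nm, hi_nm]."""
    ratios = [
        oa / nt
        for wl, oa, nt in zip(wavelengths_nm, open_air_mean, null_mean)
        if lo_nm <= wl <= hi_nm
    ]
    if not ratios:
        raise ValueError("no pixels fall inside the averaging window")
    return sum(ratios) / len(ratios)


# Made-up four-pixel spectra; only 400 nm and 450 nm lie inside the window.
wavelengths = [370.0, 400.0, 450.0, 500.0]
open_air = [2.0, 3.0, 5.0, 100.0]
null = [1.0, 2.0, 2.0, 1.0]
s_fco_fl = scalar_fluorescence_ratio(wavelengths, open_air, null)
print(s_fco_fl)  # mean of 1.5 and 2.5
```

Restricting the mean to a window keeps the noisy long-wavelength pixels out of the scalar, which is the effect the paragraph above attributes to the use of the mean.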

[0308] The one-dimensional arrays, FCOBB1 and FCOBB2, are obtained by curve-fitting the
spectra of quotients in Equations 13 and 14 with second-order polynomials and determining
values of the curve fit corresponding to each pixel. Figure 17 shows an example curve fit for
FCOBB1 (412). Unlike the fluorescence measurements, there is wavelength dependence of this
ratio, and a curve fit is used to properly reflect this wavelength dependence without introducing
excessive noise in following computations.
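
The second-order fit of Equations 13 and 14 may be sketched as below. This is an assumption-laden illustration only: the specification does not say how the fit is computed, so an ordinary least-squares fit via the normal equations is used here, and the names `fit_quadratic` and `fitted_ratio` and the sample spectra are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x**2."""
    # Normal equations: power sums of x on the left, moments of y on the right.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    b = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[r + c] for c in range(3)] + [b[r]] for r in range(3)]
    # Gaussian elimination with partial pivoting on the augmented 3x4 matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, 3))
        coeffs[r] = (a[r][3] - tail) / a[r][r]
    return coeffs


def fitted_ratio(wavelengths_nm, open_air_mean, null_mean):
    """Quadratic fit of the open air / null ratio, evaluated at every pixel."""
    lo, hi = min(wavelengths_nm), max(wavelengths_nm)
    # Map wavelengths to [-1, 1] so the normal equations stay well conditioned.
    u = [(2.0 * wl - lo - hi) / (hi - lo) for wl in wavelengths_nm]
    ratio = [oa / nt for oa, nt in zip(open_air_mean, null_mean)]
    c0, c1, c2 = fit_quadratic(u, ratio)
    return [c0 + c1 * ui + c2 * ui * ui for ui in u]


wavelengths = [370.0 + 50.0 * k for k in range(8)]   # 370 nm to 720 nm
null_mean = [2.0] * 8                                 # made-up null spectrum
open_air_mean = [2.0 * (1.2 + 0.001 * (wl - 545.0)) for wl in wavelengths]
fco_bb1 = fitted_ratio(wavelengths, open_air_mean, null_mean)
```

The fitted values, rather than the raw quotients, would then serve as the per-pixel correction array, so that pixel-level noise in the quotient is not carried into later steps.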

[0309] Block 332 in Figure 10 contains correction factors using results from the pre-patient
null target test 328, as well as the most recent factory/PM null target test 304 and open air target
test 310. The correction factors in block 332 are computed as follows:

$$\mathrm{SLFL} = \mathrm{sFCOFL} \cdot \langle I_{nt,F}(i,\lambda,t') \rangle_i \qquad (15)$$

$$\mathrm{SLBB1} = \mathrm{FCOBB1} \cdot \langle I_{nt,BB1}(i,\lambda,t') \rangle_i \qquad (16)$$

$$\mathrm{SLBB2} = \mathrm{FCOBB2} \cdot \langle I_{nt,BB2}(i,\lambda,t') \rangle_i \qquad (17)$$

where Equation 15 represents multiplying each value in the fluorescence mean pre-patient null
target spectrum by the scalar quantity sFCOFL from Equation 12; Equation 16 represents
multiplying corresponding elements of the mean pre-patient null target BB1 spectrum and the
one-dimensional array FCOBB1 from Equation 13; and Equation 17 represents multiplying
corresponding elements of the mean pre-patient null target BB2 spectrum and the
one-dimensional array FCOBB2 from Equation 14. Each of SLFL, SLBB1, and SLBB2 is a
one-dimensional array.
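
As a minimal sketch of Equations 15 through 17, the scalar and element-wise products may be written as a single helper. The name `stray_light_spectrum` and the sample numbers are hypothetical; the only assumption is that the correction factor and the mean pre-patient null target spectrum share the same pixel indexing.

```python
def stray_light_spectrum(correction, pre_patient_null_mean):
    """Scale the mean pre-patient null spectrum by a scalar or per-pixel factor."""
    if isinstance(correction, (int, float)):
        # Fluorescence case (Equation 15): one scalar for every pixel.
        return [correction * v for v in pre_patient_null_mean]
    if len(correction) != len(pre_patient_null_mean):
        raise ValueError("correction and spectrum must have equal length")
    # Reflectance case (Equations 16 and 17): element-wise product.
    return [c * v for c, v in zip(correction, pre_patient_null_mean)]


sl_fl = stray_light_spectrum(2.0, [1.0, 2.0])            # scalar factor
sl_bb1 = stray_light_spectrum([1.0, 0.5], [4.0, 4.0])    # per-pixel factor
```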

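[0310] The correction factors in block 332 of Figure 10 represent the contribution due to
internal stray light (ISL) for a given set of spectral data obtained from a given patient scan.
Combining equations above:

$$\mathrm{SLFL} = \left[ \langle I_{oa,F}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,F}(i,\lambda,t_0) \rangle_i \right]_{\text{mean},\ \lambda\,=\,375\ \text{nm to}\ 470\ \text{nm}} \cdot \langle I_{nt,F}(i,\lambda,t') \rangle_i \qquad (18)$$

$$\mathrm{SLBB1} = \left[ \langle I_{oa,BB1}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,BB1}(i,\lambda,t_0) \rangle_i \right]_{\text{fitted}} \cdot \langle I_{nt,BB1}(i,\lambda,t') \rangle_i \qquad (19)$$

$$\mathrm{SLBB2} = \left[ \langle I_{oa,BB2}(i,\lambda,t_0) \rangle_i \,/\, \langle I_{nt,BB2}(i,\lambda,t_0) \rangle_i \right]_{\text{fitted}} \cdot \langle I_{nt,BB2}(i,\lambda,t') \rangle_i \qquad (20)$$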
[0311] Alternative internal stray light correction factors are possible. For example, in one
alternative embodiment, the scalar quantity in Equation 18 is replaced with the value 1.0. In one
alternative embodiment, the first term on the right side of either or both of Equation 19 and
Equation 20 is replaced with a scalar quantity, for example, a mean value or the value 1.0.

[0312] Spectral data preprocessing 114 as detailed in Figure 11 includes compensating for
internal stray light effects as measured by SLFL, SLBB1 and SLBB2. In one embodiment, a
patient scan includes the acquisition at each interrogation point in a scan pattern (for example,
the 499-point scan pattern 202 shown in Figure 5) of a set of raw fluorescence intensity data
using the UV light source 160, a first set of raw broadband reflectance intensity data using a first
white light source (162, 188), a second set of raw broadband reflectance intensity data using a
second white light source (162, 190), and a set of raw background intensity data using no
internal light source, where each set of raw data spans a CCD pixel index corresponding to a
wavelength range between about 370 nm and 720 nm. In another embodiment, the wavelength
range is from about 370 nm to about 700 nm. In another embodiment, the wavelength range is
from about 300 nm to about 900 nm. Other embodiments include the use of different
wavelength ranges.

[0313] The raw background intensity data set is represented as the two-dimensional array
Bkgnd[] in Figure 11. Spectral data processing 114 includes subtracting the background array,
Bkgnd[], from each of the raw BB1, BB2, and F arrays on a pixel-by-pixel and
location-by-location basis. This accounts at least for electronic background and external stray
light effects, and is shown as item #1 in each of blocks 342, 344, and 346 in Figure 11.
[0314] Also, each CCD array containing spectral data includes a portion for monitoring the
power output by the light source used to obtain the spectral data. In one embodiment, the
intensity values in this portion of each array are added or integrated to provide a one-dimensional
array of scalar values, sPowerMonitor[], shown in Figure 11. Spectral data pre-processing 114
further includes dividing each element of the background-subtracted arrays at a given
interrogation point by the power monitor scalar correction factor in sPowerMonitor[]
corresponding to the given interrogation point. This allows the expression of spectral data at a
given wavelength as a ratio of received light intensity to transmitted light intensity.

[0315] Spectral data pre-processing 114 further includes subtracting each of the stray light
background arrays (SLBB1, SLBB2, and SLFL) from its corresponding background-corrected,
power-monitor-corrected spectral data array (BB1, BB2, and F) on a pixel-by-pixel,
location-by-location basis. This accounts for chromatic, temporal, and spatial variability effects
of internal stray light on the spectral data.
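
The first three pre-processing steps described above (background subtraction, power monitor normalization, and internal stray light subtraction) may be sketched as follows. The function names, the list-of-lists layout (`scan[point][pixel]`), and the sample numbers are assumptions made for illustration only; the stray light array is taken to be a single spectrum applied at every interrogation point.

```python
def preprocess_point(raw, background, power_monitor, stray_light):
    """Steps 1 to 3 for one interrogation point; spectra share pixel indexing."""
    if power_monitor == 0:
        raise ValueError("power monitor reading must be non-zero")
    return [
        (r - b) / power_monitor - sl   # subtract, normalize, subtract stray light
        for r, b, sl in zip(raw, background, stray_light)
    ]


def preprocess_scan(raw_scan, background_scan, power_monitors, stray_light):
    """Apply preprocess_point location by location."""
    return [
        preprocess_point(r, b, p, stray_light)
        for r, b, p in zip(raw_scan, background_scan, power_monitors)
    ]


# Two interrogation points, two pixels each (made-up values).
corrected = preprocess_scan(
    raw_scan=[[12.0, 22.0], [7.0, 12.0]],
    background_scan=[[2.0, 2.0], [2.0, 2.0]],
    power_monitors=[10.0, 5.0],
    stray_light=[0.5, 1.0],
)
```

The order matters in this sketch: the stray light spectrum is itself built from power-monitor-corrected data, so it is subtracted only after the sample data has been normalized in the same way.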

[0316] The remaining steps in blocks 342 and 344 of the spectral data pre-processing block
diagram 340 of Figure 11 include further factory, preventive maintenance (PM) and/or
pre-patient calibration of reflectance (BB1, BB2) measurements using one or more targets of
known, non-zero diffuse reflectance. In the embodiment shown in Figure 10, this calibration uses
results from the factory/PM custom target test 312, the factory/PM NIST-standard target test 314,
and the pre-patient custom target test 330. These calibration tests provide correction factors, as
shown in blocks 324, 326, and 334 of Figure 10, that account for chromatic, temporal, and
spatial sources of variation in broadband reflectance spectral measurements. These sources of
variation include temporal fluctuations in the illumination source, spatial inhomogeneities in the
illumination source, and chromatic aberration due to the scanning optics. The broadband
reflectance calibration tests (312, 314, 330) also account for system artifacts attributable to both
transmitted and received light, since these artifacts exist in both test reflectance measurements
and known reference measurements.

[0317] According to the illustrative embodiment, reflectance, R, computed from a set of
regions of a test sample (a test scan) is expressed as in Equation 21:

$$R = \left[ \text{Measurement} \,/\, \text{Reference Target} \right] \cdot \text{Reflectivity of Reference Target} \qquad (21)$$

where R, Measurement, and Reference Target refer to two-dimensional (wavelength, position)
arrays of background-corrected, power-corrected and/or internal-stray-light-corrected reflectance
data; Measurement contains data obtained from the test sample; Reference Target contains
data obtained from the reference target; Reflectivity of Reference Target is a known scalar value;
and division of the arrays is performed in a pixel-by-pixel, location-by-location manner.
[0318] The factory/PM NIST target test 314 uses a 60%, NIST-traceable, spectrally flat diffuse
reflectance target in the focal plane, aligned in the instrument 102 represented in Figure 3. The
NIST target test 314 includes performing four scans, each of which proceeds with the target at a
different rotational orientation, perpendicular to the optical axis of the system. For example, the
target is rotated 90° from one scan to the next. The results of the four scans are averaged on a
location-by-location, pixel-by-pixel basis to remove spatially-dependent target artifacts
(speckling) and to reduce system noise. The goal is to create a spectrally clean (low noise) and
spatially-flat data set for application to patient scan data. In one embodiment, the NIST target
test 314 is performed only once, prior to instrument 102 use in the field (factory test), and thus,
ideally, is temporally invariant.
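
The four-scan averaging just described may be sketched as follows. The name `average_scans` and the nested-list layout (`scans[k][point][pixel]`) are hypothetical choices for illustration; the sketch assumes every scan visits the same interrogation points in the same order.

```python
def average_scans(scans):
    """Average scans[k][i][p] over k, location by location and pixel by pixel."""
    n = len(scans)
    return [
        [sum(scan[i][p] for scan in scans) / n for p in range(len(scans[0][i]))]
        for i in range(len(scans[0]))
    ]


# Four made-up scans of two interrogation points with two pixels each.
four_scans = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 2.0], [3.0, 0.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 2.0], [3.0, 0.0]],
]
nist_average = average_scans(four_scans)
```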
[0319] The custom target tests 312, 330 use a custom-made target for both factory and/or
preventive maintenance calibration, as well as pre-patient calibration of reflectance data. The
custom target is a 10% diffuse reflective target with phosphorescent and/or fluorescent portions
used, for example, to align the ultraviolet (UV) light source and/or to monitor the stability of
fluorescence readings between preventive maintenance procedures. Figure 19 is a photograph of
the custom target 426 according to an illustrative embodiment. In Figure 19, the target 426
includes a portion 428 that is about 10% diffuse reflective material, with four phosphorescent
plugs 430, 432, 434, 436 equally-spaced at the periphery and a single fluorescent plug 438 at the
center. As a result of the plugs, not all scan locations in the scan pattern 202 of Figure 5, as
applied to the custom target 426, accurately measure the 10% reflective portion. Thus, a
mask provides a means of filtering out the plug-influenced portions of the custom target 426
during a custom target calibration scan 312, 330.

[0320] Figure 20 is a representation of such a mask 444 for the custom target reflectance
calibration tests 312, 330. Area 445 in Figure 20 corresponds to regions of the custom target 426
of Figure 19 that are not affected by the plugs 430, 432, 434, 436, and which, therefore, are
usable in the custom target reflectance calibration tests 312, 330. Areas 446, 448, 450, 452, and
454 of Figure 20 correspond to regions of the custom target 426 that are affected by the plugs,
and which are masked out in the custom target calibration scan results.
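
A masked mean of the kind applied to the custom target scan may be sketched as below. The function name `masked_mean_spectrum` and the boolean `usable` list are hypothetical; the sketch simply assumes one flag per interrogation point, true where the point falls in the plug-free area.

```python
def masked_mean_spectrum(scan, usable):
    """Per-pixel mean over the interrogation points flagged as usable."""
    rows = [spectrum for spectrum, ok in zip(scan, usable) if ok]
    if not rows:
        raise ValueError("mask excludes every interrogation point")
    return [sum(column) / len(rows) for column in zip(*rows)]


# Three made-up interrogation points; the middle one lies on a plug.
scan = [[1.0, 2.0], [9.0, 9.0], [3.0, 4.0]]
usable = [True, False, True]
custom_mean = masked_mean_spectrum(scan, usable)
```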
[0321] In the illustrative embodiment, the factory/PM NIST target test 314 provides
reflectance calibration data for a measured signal from a test sample (patient scan), and the test
sample signal is processed according to Equation 22:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, I_{fc}(i,\lambda,t_0) \right] \cdot 0.6 \qquad (22)$$

[0322] Where R, $I_m$, and $I_{fc}$ are two-dimensional arrays of background-corrected,
power-corrected reflectance data; R contains reflectance intensity data from the test sample
adjusted according to the reflectance calibration data; $I_m$ contains reflectance intensity data
from the sample; $I_{fc}$ contains reflectance intensity data from the factory/PM NIST-standard
target test 314; and 0.6 is the known reflectivity of the NIST-standard target. Equation 22
presumes the spectral response of the illumination source is temporally invariant such that the
factory calibration data from a given unit does not change with time, as shown in Equation 23
below:

$$I_{fc}(t') = I_{fc}(t_0) \qquad (23)$$

However, the spectral lamp function of a xenon flash lamp, as used in the illustrative
embodiment as the white light source 162 in the instrument 102 of Figure 3, is not invariant over
time.
[0323] The illustrative reflectance data spectral preprocessing 114 accounts for temporal
variance by obtaining pre-patient custom target test (330) reflectance calibration data and using
the data to adjust data from a test sample, $I_m$, to produce adjusted reflectance R, as follows:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, \langle I_{cp}(i,\lambda,t') \rangle_i \right] \cdot 0.1 \qquad (24)$$

where masked, mean reflectance intensity data from the pre-patient custom target test 330 with
10% diffuse reflectivity, $\langle I_{cp}(i,\lambda,t') \rangle_i$, replaces $I_{fc}(i,\lambda,t')$ in
Equation 22. Since the pre-patient custom target test data is updated before every patient exam,
the temporal variance effect is diminished or eliminated. In other illustrative embodiments,
various other reference targets may be used in place of the custom target 426 shown in Figure 19.

[0324] The system 100 also accounts for spatial variability in the target reference tests of
Figure 10 in pre-processing reflectance spectral data. Illustratively, spatial variability in
reflectance calibration target intensity is dependent on wavelength, suggesting chromatic
aberrations due to wavelength-dependence of transmission and/or collection optic efficiency.
[0325] The illustrative reflectance data spectral preprocessing 114 accounts for these
chromatic and spatial variability effects by obtaining reflectance calibration data and using the
data to adjust data from a test sample, $I_m$, to produce adjusted reflectance R, as follows:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, \langle I_{cp}(i,\lambda,t') \rangle_i \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_0) \rangle_i \,/\, I_{fc}(i,\lambda,t_0) \right] \cdot 0.1 \qquad (25)$$

Equation 25 accounts for variations of the intensity response of the lamp by applying the
pre-patient custom-target measurements, which are less dependent on differences caused by the
disposable, in correcting patient test sample measurements. Equation 25 also accounts for the
spatial response of the illumination source by applying the factory NIST-target measurements in
correcting patient test sample measurements.
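
Equation 25 may be sketched in Python as follows, purely as an illustration. The function name `reflectance_eq25`, the argument names, and the nested-list layout (`array[point][pixel]`) are assumptions; the custom target mean is taken to be an already-masked per-pixel spectrum, and the NIST mean is computed over all interrogation points.

```python
def reflectance_eq25(sample, custom_mean, nist_scan, custom_reflectivity=0.1):
    """Adjusted reflectance per point and pixel, following Equation 25."""
    n_points = len(nist_scan)
    # Per-pixel mean of the factory NIST target data over interrogation points.
    nist_mean = [sum(column) / n_points for column in zip(*nist_scan)]
    return [
        [
            (s / cp) * (fm / f) * custom_reflectivity
            for s, cp, fm, f in zip(sample_i, custom_mean, nist_mean, nist_i)
        ]
        for sample_i, nist_i in zip(sample, nist_scan)
    ]


# Two interrogation points, one pixel each (made-up values).
r = reflectance_eq25(
    sample=[[2.0], [4.0]],
    custom_mean=[1.0],
    nist_scan=[[1.0], [3.0]],
)
```

In this sketch the first factor carries the temporal (lamp) correction and the second carries the spatial correction: a point where the NIST target read brighter than average is scaled down by the same proportion.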

[0326] In an alternative illustrative embodiment, the NIST-target test 314 is performed as part
of pre-patient calibration 116 to produce calibration data, $I_{fc}(i,\lambda,t')$, and Equation 22 is
used in processing test reflectance data, where the quantity $I_{fc}(i,\lambda,t')$ replaces the
quantity $I_{fc}(i,\lambda,t_0)$ in Equation 22. According to this illustrative embodiment, the test
data pre-processing procedure 114 includes both factory/PM calibration 110 results and
pre-patient calibration 116 results in order to maintain a more consistent basis for the
accumulation and use of reference data from various individual units obtained at various times
from various patients in a tissue characterization system. Thus, this illustrative embodiment uses
Equation 26 below to adjust data from a test sample, $I_m$, to produce adjusted reflectance R, as
follows:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, \langle I_{fc}(i,\lambda,t') \rangle_i \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_0) \rangle_i \,/\, I_{fc}(i,\lambda,t_0) \right] \cdot 0.6 \qquad (26)$$

where the NIST-standard target test 314 is performed both as a factory/PM test 110 ($t_0$) and as a
pre-patient test 116 ($t'$).

[0327] According to the illustrative embodiment, it is preferable to combine calibration
standards with more than one target, each having a different diffuse reflectance, since calibration
is not then tied to a single reference value. Here, processing using Equation 25 is preferable to
Equation 26. Also, processing via Equation 25 may allow for an easier pre-patient procedure,
since the custom target combines functions for both fluorescence and reflectance system set-up,
avoiding the need for an additional target test procedure.

[0328] Values of the custom target reflectance in a given individual instrument 102 vary over
time and as a function of wavelength. For example, Figure 21 shows a graph 458 depicting as a
function of wavelength 462 a measure of the mean reflectivity 460, $R_{cp}$, of the 10% diffuse
target 426 of Figure 19 over the non-masked regions 445 shown in Figure 20, obtained using the
same instrument on two different days. $R_{cp}$ is calculated as shown in Equation 27:

$$R_{cp}(\lambda,t_0) = \left[ \langle I_{cp}(i,\lambda,t_0) \rangle_{i,\text{mask}} \,/\, \langle I_{fc}(i,\lambda,t_0) \rangle_i \right] \cdot R_{fc} \qquad (27)$$

where $R_{fc}$ = 0.6, the diffuse reflectance of the NIST-traceable standard target. Values of
$R_{cp}$ vary as a function of wavelength 462, as seen in each of curves 464 and 466 of Figure 21.
Also, there is a shift from curve 464 to curve 466, each obtained on a different day. Similarly,
values of $R_{cp}$ vary among different instrument units. Curves 464 and 466 show that $R_{cp}$
varies with wavelength and varies from 0.1; thus, assuming $R_{cp}$ = 0.1 as in Equation 25 may
introduce inaccuracy.
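
As a hedged illustration of Equation 27, the measured custom target reflectivity may be computed per pixel from the two mean spectra. The function name `custom_target_reflectivity` and the sample values are hypothetical; the inputs are assumed to be the masked mean custom target spectrum and the mean NIST target spectrum, both already background-, power-, and stray-light-corrected.

```python
def custom_target_reflectivity(custom_masked_mean, nist_mean, nist_reflectivity=0.6):
    """Per-pixel reflectivity of the custom target relative to the NIST target."""
    return [
        nist_reflectivity * cp / fc
        for cp, fc in zip(custom_masked_mean, nist_mean)
    ]


# Made-up two-pixel spectra: the second pixel reads 20% brighter than nominal.
r_cp = custom_target_reflectivity([1.0, 1.2], [6.0, 6.0])
deviation_from_nominal = [value - 0.1 for value in r_cp]
```

A non-zero entry in `deviation_from_nominal` corresponds to the kind of departure from the nominal 10% value that the curves of Figure 21 are described as showing.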
[0329] Equation 25 can be modified to account for this temporal and wavelength dependence,
as shown in Equation 28:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, \langle I_{cp}(i,\lambda,t') \rangle_i \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_0) \rangle_i \,/\, I_{fc}(i,\lambda,t_0) \right] \cdot R_{cp,\text{fitted}} \qquad (28)$$

where $R_{cp,\text{fitted}}$ is an array of values of a second-order polynomial curve fit of $R_{cp}$
shown in Equation 27. The polynomial curve fit reduces the noise in the $R_{cp}$ array. Other
curve fits may be used alternatively. For example, Figure 22A shows a graph 490 depicting, for
seven individual instruments, curves 496, 498, 500, 502, 504, 506, 508 of sample reflectance
intensity using the BB1 white light source 188 as depicted in Figures 4, 7 and 8 graphed as
functions of wavelength 494. Each of the seven curves represents a mean of reflectance intensity
at each wavelength, calculated using Equation 25 for regions confirmed as metaplasia by
impression. Figure 22B shows a graph 509 depicting corresponding curves 510, 512, 514, 516,
518, 520, 522 of test sample reflectance intensity calculated using Equation 28, where $R_{cp}$
varies with time and wavelength. The variability between individual instrument units decreases
when using measured values for $R_{cp}$ as in Equation 28 rather than a constant value. The
variability between reflectance spectra obtained from samples having a common
tissue-class/state-of-health classification, but using different instrument units, decreases when
using measured values for $R_{cp}$ as in Equation 28 rather than a constant value as in
Equation 25.

[0330] In an alternative embodiment, processing of reflectance data includes applying
Equation 28 without first fitting the $R_{cp}$ values to a quadratic polynomial. Thus, processing
is performed in accordance with Equation 29 to adjust data from a test sample, $I_m$, to produce
adjusted reflectance R, as follows:

$$R(i,\lambda,t') = \left[ I_m(i,\lambda,t') \,/\, \langle I_{cp}(i,\lambda,t') \rangle_i \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_0) \rangle_i \,/\, I_{fc}(i,\lambda,t_0) \right] \cdot R_{cp} \qquad (29)$$

[0331] Applying Equation 29, however, introduces an inconsistency in the reflectance spectra
at about 490 nm, caused, for example, by the intensity from the 60% reflectivity factory
calibration target exceeding the linear range of the CCD array. This can be avoided by using a
darker factory calibration target in the factory NIST target test 314, for example, a target having
a known diffuse reflectance from about 10% to about 30%.

[0332] Results from the factory/PM custom target test 312, the factory/PM NIST target test
314, and the pre-patient custom target test 330 provide the correction factors shown in blocks
324, 326, and 334, respectively, used in preprocessing reflectance data from a patient scan using
the BB1 white light source 188 and the BB2 white light source 190 shown in Figures 4, 7, and 8.
Correction factors in block 324 represent background-subtracted, power-monitor-corrected
(power-corrected), and null-target-subtracted reflectance data from a given factory/PM custom
target test 312 (cp) and are shown in Equations 30 and 31:

$$\mathrm{FCCTMMBB1} = \langle I_{cp,BB1}(i,\lambda,t_0) \rangle_{i,\text{masked}} - \mathrm{FCNULLBB1} \qquad (30)$$

$$\mathrm{FCCTMMBB2} = \langle I_{cp,BB2}(i,\lambda,t_0) \rangle_{i,\text{masked}} - \mathrm{FCNULLBB2} \qquad (31)$$

where FCNULLBB1 and FCNULLBB2 are given by Equations 4 and 5, and
$\langle\ \rangle_{i,\text{masked}}$ represents a one-dimensional array of mean data computed on a
pixel-by-pixel basis in regions of area 445 of the scan pattern 444 of Figure 20.

[0333] Correction factors in block 326 of Figure 10 represent ratios of background-subtracted,
power-corrected, and null-target-subtracted reflectance data from a factory/PM custom target test
312 (cp) and a factory/PM NIST standard target test 314 (fc) and are shown in Equations 32, 33,
and 34:

$$\mathrm{FCBREF1}[\,] = \frac{\langle I_{fc,BB1}(i,\lambda,t_0)_{\text{avg of 4}} - \mathrm{FCNULLBB1} \rangle_i}{I_{fc,BB1}(i,\lambda,t_0)_{\text{avg of 4}} - \mathrm{FCNULLBB1}} \qquad (32)$$

$$\mathrm{FCBREF2}[\,] = \frac{\langle I_{fc,BB2}(i,\lambda,t_0)_{\text{avg of 4}} - \mathrm{FCNULLBB2} \rangle_i}{I_{fc,BB2}(i,\lambda,t_0)_{\text{avg of 4}} - \mathrm{FCNULLBB2}} \qquad (33)$$

$$\mathrm{CALREF} = \left[ 0.5 \cdot \left( \frac{\mathrm{FCCTMMBB1}}{\langle \mathrm{FCBREF1}[\,] \rangle_i} + \frac{\mathrm{FCCTMMBB2}}{\langle \mathrm{FCBREF2}[\,] \rangle_i} \right) \right]_{\text{interp, fitted}} \qquad (34)$$

where values of the two-dimensional arrays $I_{fc,BB1}$ and $I_{fc,BB2}$ are averages of data using
the target at each of four positions, rotated 90° between each position; and all divisions,
subtractions, and multiplications are on a location-by-location, pixel-by-pixel basis. The
correction factor, CALREF, is a one-dimensional array of values of the quantity in brackets [ ] on
the right side of Equation 34, interpolated such that they correspond to wavelengths at 1-nm
intervals between $\lambda$ = 360 nm and $\lambda$ = 720 nm. The interpolated values are then fit
with a quadratic or other polynomial to reduce noise.

[0334] Correction factors in block 334 of Figure 10 represent background-subtracted,
power-corrected, internal-stray-light-corrected reflectance data from a pre-patient custom target
test 330 (cp) and are given in Equations 35 and 36 as follows:

$$\mathrm{BREFMBB1} = \langle I_{cp,BB1}(i,\lambda,t') - \mathrm{SLBB1} \rangle_i \qquad (35)$$

$$\mathrm{BREFMBB2} = \langle I_{cp,BB2}(i,\lambda,t') - \mathrm{SLBB2} \rangle_i \qquad (36)$$

where SLBB1 and SLBB2 are as shown in Equations 19 and 20.

[0335] Steps #4, 5, and 6 in each of blocks 342 and 344 of the spectral data pre-processing
block diagram 340 of Figure 11 include processing patient reflectance data using the correction
factors from blocks 324, 326, and 334 of Figure 10 computed using results of the factory/PM
custom target test 312, the factory/PM NIST standard target test 314, and the pre-patient custom
target test 330.

[0336] In step #4 of block 342 in Figure 11, the array of background-subtracted,
power-corrected, internal-stray-light-subtracted patient reflectance data obtained using the BB1
light source is multiplied by the two-dimensional array correction factor, FCBREF1[], and then in
step #5, is divided by the correction factor BREFMBB1. After filtering using, for example, a
5-point median filter and a second-order 27-point Savitzky-Golay filter, the resulting array is
linearly interpolated using results of the wavelength calibration step 302 in Figure 10 to produce
a two-dimensional array of spectral data corresponding to wavelengths ranging from 360 nm to
720 nm in 1-nm increments at each of 499 interrogation points of the scan pattern 202 shown in
Figure 5. This array is multiplied by CALREF in step #6 of block 342 in Figure 11, and
pre-processing of the BB1 spectral data in this embodiment is complete.
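
Two of the operations named above, the 5-point median filter and the linear interpolation onto a 1-nm grid, may be sketched as follows. This is an illustrative sketch only: the Savitzky-Golay stage is omitted, the handling of the spectrum edges (a shortened window, and linear extrapolation from the nearest segment) is an assumption not stated in the specification, and the function names are hypothetical.

```python
from statistics import median


def median_filter(values, window=5):
    """Running median; the window is shortened at the two ends of the spectrum."""
    half = window // 2
    out = []
    for k in range(len(values)):
        lo, hi = max(0, k - half), min(len(values), k + half + 1)
        out.append(median(values[lo:hi]))
    return out


def interpolate_to_grid(wavelengths_nm, values, start=360, stop=720):
    """Linear interpolation onto 1-nm steps.

    wavelengths_nm must be strictly ascending and contain at least two entries.
    """
    grid, out, j = list(range(start, stop + 1)), [], 0
    for g in grid:
        # Advance to the segment [x0, x1] that contains the grid wavelength g.
        while j < len(wavelengths_nm) - 2 and wavelengths_nm[j + 1] < g:
            j += 1
        x0, x1 = wavelengths_nm[j], wavelengths_nm[j + 1]
        t = (g - x0) / (x1 - x0)
        out.append(values[j] + t * (values[j + 1] - values[j]))
    return grid, out


despiked = median_filter([1.0, 1.0, 100.0, 1.0, 1.0])
grid, resampled = interpolate_to_grid([360.0, 362.0, 364.0], [0.0, 2.0, 4.0],
                                      start=360, stop=364)
```

The median stage removes isolated spikes before any smoothing, and the interpolation puts every instrument's spectra on a common wavelength axis so that CALREF can be applied element by element.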
[0337] Steps #4, 5, and 6 in block 344 of Figure 11 concern processing of BB2 data and are
directly analogous to the processing of BB1 data discussed above.

[0338] Steps #4 and 5 in block 346 of Figure 11 include processing fluorescence data using
factory/PM-level correction factors, applied after background correction (step #1), power
monitor correction (step #2), and stray light correction (step #3) of fluorescence data from a test
sample. Steps #4 and 5 include application of correction factors sFCDYE and IRESPONSE,
which come from the factory/PM fluorescent dye cuvette test 306 and the factory/PM tungsten
source test 308 in Figure 10.

[0339] The factory/PM tungsten source test 308 accounts for the wavelength response of the 
collection optics for a given instrument unit. The test uses a quartz tungsten halogen lamp as a 
light source. Emission from the tungsten filament approximates a blackbody emitter. Planck's 
radiation law describes the radiation emitted into a hemisphere by a blackbody (BB) emitter: 



where a = 27rhc 2 = 3.742 x 10 16 [W(nm) 4 /cm 2 ]; b = hc/k = 1.439 x 10 7 [(nm)K]; T is source 
temperature; CE is a fitted parameter to account for collection efficiency; and both T and CE are 
treated as variables determined for a given tungsten lamp by curve-fitting emission data to 
Equation 37. 

[0340] The lamp temperature, T, is determined by fitting NIST-traceable source data to
Equation 37. Figure 23 shows a graph 582 depicting the spectral irradiance 584, W_NIST lamp, of a
NIST-traceable quartz-tungsten-halogen lamp, along with a curve fit 590 of the data to the model
in Equation 37 for blackbody irradiance, W_BB. Since the lamp is a gray-body and not a perfect
blackbody, Equation 37 includes a proportionality constant, CE. This proportionality constant
also accounts for the "collection efficiency" of the setup in an instrument 102 as depicted in the
tissue characterization system 100 of Figure 1. In the illustrative embodiment, the target from
which measurements are obtained is about 50 cm away from the lamp and has a finite collection
cone that subtends a portion of the emission hemisphere of the lamp. Thus, while W_BB(λ) in
Equation 37 has units of [W/nm], calibration values for a given lamp used in the instrument 102
in Figure 1 have units of [W/cm^2-nm at 50 cm distance]. The two calibration constants, CE and T,
are obtained for a given lamp by measuring the intensity of the given lamp relative to the
intensity of a NIST-calibrated lamp using Equation 38:

W_BB(λ) = [a · (CE)] / [λ^5 · {exp(b/(λT)) - 1}]    (37)

W_lamp = [I_lamp / I_NIST lamp] · W_NIST lamp    (38)

Then, values of T and CE are determined by plotting W_lamp versus wavelength and curve-fitting
using Equation 37. The curve fit provides a calibrated lamp response, I_lamp(λ), to which the
tungsten lamp response measured during factory/PM testing 308 at a given interrogation point
and using a given instrument, S_lamp(i,λ), is compared. This provides a measure of "instrument
response", IR(i,λ), for the given point and the given instrument, as shown in Equation 39:

IR(i,λ) = S_lamp(i,λ) / I_lamp(λ)    (39)
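
The gray-body model of Equation 37 and the instrument response ratio of Equation 39 can be sketched numerically as follows. This is a minimal illustration in Python, not the instrument's software: the function names, the fitted values T = 3000 K and CE = 1.0e-6, and the measured counts are hypothetical, and only the constants a and b are taken from the text.

```python
import math

A = 3.742e16   # a = 2*pi*h*c^2 in W*nm^4/cm^2, as given in the text
B = 1.439e7    # b = h*c/k in nm*K, as given in the text


def graybody_irradiance(wavelength_nm, temperature_k, ce):
    """Equation 37: W_BB = a*CE / (lambda^5 * (exp(b / (lambda*T)) - 1))."""
    exponent = B / (wavelength_nm * temperature_k)
    return (A * ce) / (wavelength_nm ** 5 * (math.exp(exponent) - 1.0))


def instrument_response(s_lamp, i_lamp):
    """Equation 39 at one interrogation point: IR(lambda) = S_lamp(lambda) / I_lamp(lambda)."""
    return [s / w for s, w in zip(s_lamp, i_lamp)]


# Hypothetical fitted constants and measured counts, for illustration only.
wavelengths = [400.0, 500.0, 600.0, 700.0]
i_lamp = [graybody_irradiance(w, 3000.0, 1.0e-6) for w in wavelengths]
s_lamp = [2.0 * w for w in i_lamp]   # pretend the instrument reads twice the model value
ir = instrument_response(s_lamp, i_lamp)
```

In practice T and CE would come from a nonlinear least-squares fit of W_lamp against wavelength; that fitting step is omitted here.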

[0341] The factory/PM tungsten source test 308 in Figure 10 includes collecting an intensity
signal from the tungsten lamp as its light reflects off an approximately 99% reflective target.
The test avoids shadowing effects by alternately positioning the tungsten source at each of two
locations (for example, on either side of the probe head 192 at locations corresponding to the
white light source locations 188, 190 shown in Figure 8) and using the data for each given
interrogation point corresponding to the source position where the given point is not in shadow.

[0342] Once the instrument response measure, IR(i,λ), is obtained, a correction factor is
determined such that its value is normalized to unity at a given wavelength, for example, at λ =
500 nm. Thus, the distance between the lamp and the detecting aperture, the photoelectron
quantum efficiency of the detector, and the reflectivity of the target do not need to be measured.

[0343] According to the illustrative embodiment, the fluorescence component of the spectral
data pre-processing 114 of the system 100 of Figure 1 corrects a test fluorescence intensity
signal, S_F(i,λ), for individual instrument response by applying Equation 40 to produce I_F(i,λ), the
instrument-response-corrected fluorescence signal:

I_F(i,λ) = S_F(i,λ) ÷ [{500 · IR(i,λ)} / {λ · IR(i,500)}]    (40)

where IR(i,500) is the value of the instrument response measure IR at point i and at wavelength λ
= 500 nm; and where the term λ/500 converts the fluorescence intensity from energetic to
photometric units, proportional to fluorophore concentration. In one embodiment, the
differences between values of IR at different interrogation points are small, and a mean of IR(λ)
over all interrogation points is used in place of IR(i,λ) in Equation 40.
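
As a worked illustration of Equation 40, the sketch below applies the correction at a single interrogation point. The function name and the sample values are hypothetical, and the wavelength grid is assumed to contain 500 nm exactly.

```python
def correct_fluorescence(s_f, ir, wavelengths, ref_nm=500.0):
    """Equation 40 at one point: I_F = S_F / [(500 * IR(lambda)) / (lambda * IR(500))]."""
    ir_ref = ir[wavelengths.index(ref_nm)]   # IR(i, 500); assumes 500 nm is on the grid
    corrected = []
    for s, r, w in zip(s_f, ir, wavelengths):
        factor = (ref_nm * r) / (w * ir_ref)
        corrected.append(s / factor)
    return corrected


# Hypothetical raw signal and instrument response at three wavelengths.
wavelengths = [450.0, 500.0, 550.0]
ir = [0.8, 1.0, 1.25]
s_f = [80.0, 100.0, 125.0]
i_f = correct_fluorescence(s_f, ir, wavelengths)   # approximately [90.0, 100.0, 110.0]
```

At 500 nm the factor is unity by construction, so the signal there is unchanged.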

[0344] The fluorescent dye cuvette test 306 accounts for variations in the efficiency of the
collection optics 200 of a given instrument 102. Fluorescence collection efficiency depends on a
number of factors, including the spectral response of the optics and detector used. In one
embodiment, for example, the collection efficiency tends to decrease when a scan approaches the
edge of the optics. A fluorescent dye cuvette test 306, performed as part of factory and/or
preventive maintenance (PM) calibration, provides a means of accounting for efficiency
differences.

[0345] An approximately 50-mm-diameter cuvette filled with a dye solution serves as a target for the
fluorescent dye cuvette test 306 to account for collection optic efficiency variation with
interrogation point position and variation between different units. The factory/PM dye-filled
cuvette test 306 includes obtaining the peak intensity of the fluorescence intensity signal at each
interrogation point of the dye-filled cuvette, placed in the calibration target port of the instrument
102, and comparing it to a mean peak intensity of the dye calculated for a plurality of units.

[0346] Illustratively, a calibrated dye cuvette can be prepared as follows. First, the
fluorescence emission of a 10-mm-pathlength quartz cuvette filled with ethylene glycol is
obtained. The ethylene glycol is of 99+% spectrophotometric quality, such as that provided by
Aldrich Chemical Company. The fluorescence emission reading is verified to be less than about
3000 counts, particularly at wavelengths near the dye peak intensity. An approximately 2.5 ×
10^-4 moles/L solution of coumarin-515 in ethylene glycol is prepared. Coumarin-515 is a
powdered dye of molecular weight 347, produced, for example, by Exciton Chemical Company.
The solution is diluted with ethylene glycol to a final concentration of about 1.2 × 10^-5 moles/L.
Then, a second 10-mm-pathlength quartz cuvette is filled with the coumarin-515 solution, and an
emission spectrum is obtained. The fluorescence emission reading is verified to have a
maximum between about 210,000 counts and about 250,000 counts. The solution is titrated with
either ethylene glycol or concentrated coumarin-515 solution until the peak lies in this range.
Once achieved, 50-mm-diameter quartz cuvettes are filled with the titrated standard solution and
flame-sealed.

[0347] A correction factor for fluorescence collection efficiency can be determined as follows.
First, the value of fluorescence intensity of an instrument-response-corrected signal, I_F(i,λ), is
normalized by a measure of the UV light energy delivered to the tissue as in Equation 41:

F_T(i,λ) = [I_F(i,λ) / P_m(i)] · [P_m / E_μJ]_FC/PM    (41)

where F_T(i,λ) is the instrument-response-corrected, power-monitor-corrected fluorescence
intensity signal; P_m(i) is a power-monitor reading that serves as an indirect measure of laser
energy, determined by integrating or adding intensity readings from pixels on a CCD array
corresponding to a portion on which a beam of the output laser light is directed; and
[P_m / E_μJ]_FC/PM is the ratio of power monitor reading to output laser energy determined during
factory calibration and/or preventive maintenance (FC/PM).

[0348] Next, the illustrative embodiment includes obtaining the fluorescence intensity
response of a specific unit at a specific interrogation point (region) in its scan pattern using a
cuvette of the titrated coumarin-515 dye solution as the target, and comparing that response to a
mean fluorescence intensity response calculated for a set of units, after accounting for laser
energy variations as in Equation 41. Equation 42 shows a fluorescence collection efficiency
correction factor for a given unit applied to an instrument-response-corrected fluorescence
signal, I_F(i,λ), along with the energy correction of Equation 41:

F_T′(i,λ) = [I_F(i,λ) / P_m(i)] · [P_m / E_μJ]_PM · ⟨[I_Dye(251,λ_p) / P_m(251)] · [P_m / E_μJ]_PM⟩_Instruments / {[I_Dye(i,λ_p) / P_m(i)] · [P_m / E_μJ]_PM}    (42)

where I_Dye(i,λ_p) is the peak measured fluorescence intensity at interrogation position i using the
dye-filled cuvette, as shown in Figure 31; λ_p is the wavelength (or its approximate pixel index
equivalent) corresponding to the peak intensity; and the quantity in brackets ⟨ ⟩_Instruments is the mean
power-corrected intensity at interrogation point 251, corresponding to the center of the
exemplary scan pattern of Figure 5, calculated for a plurality of units.

[0349] The fluorescence collection efficiency tends to decrease when the scans approach the
edge of the optics. Figure 24 shows typical fluorescence spectra from the dye test 306. The
graph 614 in Figure 24 depicts as a function of wavelength 618 the fluorescence intensity 616 of
the dye solution at each region of a 499-point scan pattern. The curves 620 all have
approximately the same peak wavelength, λ_p, but the maximum fluorescence intensity values
vary.

[0350] Figure 25 shows how the peak fluorescence intensity (intensity measured at pixel 131
corresponding approximately to λ_p) 624, determined in Figure 24, varies as a function of scan
position (interrogation point) 626. Oscillations are due at least in part to optic scanning in the
horizontal plane, while the lower frequency frown pattern is due to scan stepping in the vertical
plane. According to the illustrative embodiment, curves of the fluorescence intensity of the dye
cuvette at approximate peak wavelength are averaged to improve the signal-to-noise ratio.

[0351] Equation 42 simplifies to Equations 43 and 44 as follows:

F_T′(i,λ) = [I_F(i,λ) / P_m(i)] · ⟨[I_Dye(251,λ_p) / P_m(251)] · [P_m / E_μJ]_PM⟩_Instruments · [P_m(i) / I_Dye(i,λ_p)]    (43)

F_T′(i,λ) = [I_F(i,λ) / P_m(i)] · FCDYE(i)    (44)

The term, [P_m / E_μJ]_PM, drops out of Equation 42. Variations in laser energy measurements
become less important as the energy is averaged over multiple measurements made on many
instruments.

[0352] In Figure 10, the correction factor sFCDYE in block 318 is a one-dimensional scalar
array and is calculated using Equation 45:

sFCDYE = ⟨[I_Dye(251,λ_p) / P_m(251)] · [P_m / E_μJ]_PM⟩_Instruments / I_Dye(i,λ_p)    (45)

Here, values of I_Dye(i,λ_p) are background-subtracted, power-corrected, and null-target-subtracted.

[0353] In Figure 10, the correction factor IRESPONSE in block 320 is a one-dimensional
array and is calculated using the results of the factory/PM tungsten source test 308, as in
Equation 46:

IRESPONSE = [{500 · IR(i,λ)} / {λ · IR(i,500)}]    (46)

where IR(i,500) is the value of the instrument response measure IR given in Equation 39 at point
i and at wavelength λ = 500 nm.

[0354] Steps #4 and 5 in block 346 of the fluorescence spectral data pre-processing block
diagram 340 of Figure 11 include processing fluorescence data using sFCDYE and
IRESPONSE as defined in Equations 45 and 46. The fluorescence data pre-processing proceeds
by background-subtracting, power-correcting, and stray-light-subtracting fluorescence data from
a test sample using Bkgnd[], sPowerMonitor[], and SLFL as shown in Steps #1, 2, and 3 in
block 346 of Figure 11. Then, the result is multiplied by sFCDYE and divided by IRESPONSE
on a pixel-by-pixel, location-by-location basis. Next, the resulting two-dimensional array is
smoothed using a 5-point median filter, then a second-order, 27-point Savitzky-Golay filter, and
interpolated using the pixel-to-wavelength conversion determined in block 302 of Figure 10 to
produce an array of data corresponding to a spectrum covering a range from 360 nm to 720 nm
at 1-nm intervals, for each of 499 interrogation points of the scan pattern.
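
The two smoothing stages named above can be sketched for a single spectrum as follows. This is an illustrative Python sketch under stated assumptions, not the instrument's implementation: the function names are hypothetical, the Savitzky-Golay weights use the standard closed form for quadratic smoothing, and samples too close to either end of the spectrum are simply left unsmoothed because the text does not say how edges are handled.

```python
from statistics import median


def median_filter(values, width=5):
    """Sliding median; the first and last width // 2 samples are left unchanged."""
    half = width // 2
    out = list(values)
    for k in range(half, len(values) - half):
        out[k] = median(values[k - half:k + half + 1])
    return out


def savgol_coefficients(width=27):
    """Closed-form weights for second-order (quadratic) Savitzky-Golay smoothing."""
    m = width // 2
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3.0 * (3 * m * m + 3 * m - 1 - 5 * j * j) / denom
            for j in range(-m, m + 1)]


def savgol_smooth(values, width=27):
    """Weighted moving average with Savitzky-Golay weights; edges left unchanged."""
    m = width // 2
    coeffs = savgol_coefficients(width)
    out = list(values)
    for k in range(m, len(values) - m):
        out[k] = sum(c * values[k - m + n] for n, c in enumerate(coeffs))
    return out


# Hypothetical spectrum: a smooth quadratic trend with one single-pixel spike.
raw = [0.01 * k * k for k in range(60)]
raw[30] += 500.0
smoothed = savgol_smooth(median_filter(raw))
```

The median stage removes isolated spikes before the polynomial stage, which would otherwise spread a spike across its 27-point window.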

[0355] As a further feature, the stability of fluorescence intensity readings is monitored
between preventive maintenance procedures. This may be performed prior to each patient scan
by measuring the fluorescence intensity of the center plug 438 of the custom target 426 shown in
Figure 19 and comparing the result to the expected value from the most recent preventive
maintenance test. If the variance from the expected value is significant, and/or if the time
between successive preventive maintenance testing is greater than about a month, the following
correction factor may be added to those in block 346 of Figure 11:



[I_ct(251,λ_p) / P_m]_PM / [I_ct(251,λ_p) / P_m]_PP    (47)

where PM denotes preventive maintenance test results; PP denotes pre-patient test results;
I_ct(251,λ_p) is the fluorescence peak intensity reading at scan position 251 (center of the custom
target) at peak wavelength λ_p; and P_m is the power monitor reading at scan position 251.

[0356] The spectral data pre-processing 114 in Figure 11 further includes a procedure for
characterizing noise and/or applying a threshold specification for acceptable noise performance.
Noise may be a significant factor in fluorescence spectral data measurements, particularly where
the peak fluorescence intensity is below about 20 counts/μJ (here, and elsewhere in this
specification, values expressed in terms of counts/μJ are interpretable in relation to the mean
fluorescence of normal squamous tissue being 70 ct/μJ at about 450 nm).

[0357] The procedure for characterizing noise includes calculating a power spectrum for a null
target background measurement. The null target background measurement uses a null target
having about 0% reflectivity, and the measurement is obtained with internal lights off and
optionally with all external lights turned off so that room lights and other sources of stray light
do not affect the measurement. Preferably, the procedure includes calculating a mean null target
background spectrum of the individual null target background spectra at all interrogation points
on the target, for example, at all 499 points of the scan pattern 202 of Figure 5. Then, the
procedure subtracts the mean spectrum from each of the individual null target background
spectra and calculates the Fast Fourier Transform (FFT) of each mean-subtracted spectrum.
Then, a power spectrum is calculated for each FFT spectrum and a mean power spectrum is
obtained.
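
The noise characterization steps above (mean spectrum, mean subtraction, Fourier transform, power spectrum, mean power spectrum) can be sketched as follows. The sketch is illustrative only: the function names and sample data are hypothetical, a direct discrete Fourier transform stands in for the FFT (it yields the same values, more slowly), and the power normalization |X|^2 / N is an assumption because the text does not specify one.

```python
import cmath


def column_mean(rows):
    """Element-wise mean of equally long rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dft(values):
    """Direct discrete Fourier transform (same result as an FFT)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * f * k / n)
                for k, v in enumerate(values))
            for f in range(n)]


def mean_power_spectrum(spectra):
    """Mean over interrogation points of the power spectrum of each mean-subtracted spectrum."""
    mean_spec = column_mean(spectra)
    powers = []
    for spectrum in spectra:
        residual = [v - m for v, m in zip(spectrum, mean_spec)]
        powers.append([abs(c) ** 2 / len(residual) for c in dft(residual)])
    return column_mean(powers)


# Hypothetical null-target background spectra at two interrogation points.
null_spectra = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
noise_power = mean_power_spectrum(null_spectra)
```

With this normalization, the power spectrum of each residual sums to the sum of its squared samples (Parseval's relation), which gives a quick sanity check on the transform.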

[0358] Figure 26 shows a graph 678 depicting exemplary mean power spectra for various
individual instruments 684, 686, 688, 690, 692, 694, 696. A 27-point Savitzky-Golay filter has
an approximate corresponding frequency of about 6300 s^-1 and frequencies above about 20,000
s^-1 are rapidly damped by applying this filter. In the case of a 27-point Savitzky-Golay filter,
spectral data pre-processing in Figure 11 further includes applying a threshold maximum
criterion of 1 count in the power spectrum for frequencies below 20,000 s^-1. Here, data from an
individual unit must not exhibit noise greater than 1 count at frequencies below 20,000 s^-1 in
order to satisfy the criterion. In Figure 26, the criterion is not met for units with curves 692 and
696, since their power spectra contain points 706 and 708, each exceeding 1 count at frequencies
below 20,000 s^-1. The criterion is met for all other units.

[0359] According to an alternative illustrative embodiment, a second noise criterion is applied
instead of or in addition to the aforementioned criterion. The second criterion specifies that the
mean power spectral intensity for a given unit be below 1.5 counts at all frequencies. In Figure
26, the criterion is not met for units with curves 692 and 696, since their power spectra contain
points 700 and 702, each exceeding 1.5 counts.

[0360] The illustrative spectral data pre-processing 114 in Figure 11 and/or the factory/PM 110
and pre-patient calibration 116 and correction in Figure 10 further includes applying one or more
validation criteria to data from the factory/PM 110 and pre-patient 116 calibration tests. The
validation criteria identify possibly-corrupted calibration data so that the data are not
incorporated in the core classifier algorithms and/or the spectral masks of steps 132 and 130 in
the system 100 of Figure 1. The validation criteria determine thresholds for acceptance of the
results of the calibration tests. According to the illustrative embodiment, the system 100 of
Figure 1 signals if validation criteria are not met and/or prompts retaking of the data.

[0361] Validation includes validating the results of the factory/PM NIST 60% diffuse
reflectance target test 314 in Figure 10. Validation may be necessary, for example, because the
intensity of the xenon lamp used in the test 314 oscillates during a scan over the 25-mm scan
pattern 202 of Figure 5. The depth of modulation of measured reflected light intensity depends,
for example, on the homogeneity of the illumination source at the target, as well as the collection
efficiency over the scan field. The depth of modulation also depends on how well the target is
aligned relative to the optical axis. In general, inhomogeneities of the illumination source are
less important than inhomogeneities due to target misalignment, since illumination source
inhomogeneities are generally accounted for by taking the ratio of reflected light intensity to
incident light intensity. Thus, the calibration 110, 116 methods use one or two metrics to sense
off-center targets and prompt retaking of data.

[0362] One such metric includes calculating a coefficient of variation, CV_i(λ), of measured
reflected light intensity across the scan field according to Equation 48:

CV_i(λ) = std(I(λ,i))_i / mean(I(λ,i))_i    (48)




where I(λ,i) = mean[{I_target(λ,i) - I_bkg(λ,i)} / P_m(i)]_4 rotations; "std" represents standard deviation; i
represents an interrogation point; λ represents wavelength (in one embodiment, between 370 nm
and 700 nm); and P_m(i) represents the power monitor value for interrogation point i. I(λ,i) is the
mean of the background-subtracted (bkg), power-monitor-corrected reflectance intensity values
from the NIST target measured 4 times, rotating the target 90° between each measurement.
Validation according to the metric of Equation 48 requires the value of CV_i(λ) be less than an
experimentally-determined, fixed value.
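
The coefficient of variation metric can be sketched as below. This is a hedged illustration: the function names and numbers are hypothetical, and the sample standard deviation is used because the text does not say whether "std" is the sample or population form.

```python
from statistics import mean, stdev


def rotation_mean(target, bkg, power_monitor):
    """I(lambda, i): mean over rotations of (I_target - I_bkg) / P_m at one point and wavelength."""
    return mean((t - b) / p for t, b, p in zip(target, bkg, power_monitor))


def coefficient_of_variation(intensity):
    """intensity[i][k] holds I(lambda_k, i); returns std over i divided by mean over i, per wavelength."""
    return [stdev(column) / mean(column) for column in zip(*intensity)]


# Hypothetical readings from four target rotations at one point and wavelength.
i_point = rotation_mean([110.0, 120.0, 130.0, 140.0],
                        [10.0, 20.0, 30.0, 40.0],
                        [10.0, 10.0, 10.0, 10.0])

# Hypothetical corrected intensities: three interrogation points, two wavelengths.
intensity = [[10.0, 20.0], [10.0, 30.0], [10.0, 40.0]]
cv = coefficient_of_variation(intensity)
```

A perfectly uniform field gives a coefficient of variation of zero, so larger values at a wavelength indicate stronger variation across the scan field.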

[0363] Another metric from the 60% diffuse target test 314 includes calculating the relative
difference, RD, between the minimum and maximum measured intensity over the scan field
according to Equation 49:

RD(λ) = 2 · [max(I′(λ,i))_i - min(I′(λ,i))_i] / [max(I′(λ,i))_i + min(I′(λ,i))_i]    (49)



where I′(λ,i) = mean[{(I_target(λ,i) - I_bkg(λ,i)) / P_m(i)} · mean(P_m(i))_i]_4 rotations. Here, I′ is scaled by the
mean of the power monitor values. In one embodiment, the relative difference, RD, between the
minimum and maximum computed in Equation 49 is more sensitive to off-centered targets than
the coefficient of variation, CV_i, computed in Equation 48. Here, validation requires the value of
RD(λ) be less than an experimentally-determined, fixed value. In the illustrative embodiment,
validation requires that Equation 50 be satisfied as follows:

RD(λ) < 0.7 for λ between 370 nm and 700 nm    (50)

where RD(λ) is given by Equation 49.

[0364] Validation also includes validating the results of the tungsten source test 308 from
Figure 10 using the approximately 99% diffuse reflectivity target. This test includes obtaining
two sets of data, each set corresponding to a different position of the external tungsten source
lamp. Data from each set that are not affected by shadow are merged into one set of data. Since
the power monitor correction is not applicable for this external source, a separate background
measurement is obtained.

[0365] The illustrative calibration methods 110, 116 use one or two metrics to validate data
from the tungsten source test 308. One metric includes calculating a coefficient of variation,
CV_i(λ), of the mean foreground minus the mean background data, W(λ,i), of the merged set of
data, as in Equation 51:

CV_i(λ) = std(W(λ,i))_i / mean(W(λ,i))_i    (51)

where the coefficient of variation, CV_i(λ), is calculated using the mean instrument spectral
response curve, IR, averaging over all interrogation points of the scan pattern. Validation
requires the value of CV_i(λ) be less than an experimentally-determined, fixed value. In the
illustrative embodiment, validation requires that Equation 52 be satisfied for all interrogation
points i:

CV_i(λ) < 0.5 for λ between 370 nm and 700 nm    (52)

where CV_i(λ) is given by Equation 51.

[0366] A second metric includes calculating a mean absolute difference spectrum, MAD(λ),
comparing the current spectral response curve to the last one measured, as in Equation 53:

MAD(λ) = mean(|IR_t(i,λ) - IR_t-1(i,λ)|)_i    (53)

where the instrument spectral response curve, IR, is given by Equation 39. Validation requires
the value of MAD(λ) be less than an experimentally-determined, fixed value. In one
embodiment, validation requires that Equation 54 be satisfied:

MAD(λ) < 0.2 for λ between 370 nm and 700 nm    (54)

where MAD(λ) is given by Equation 53.
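
A minimal sketch of the mean absolute difference check follows. The function names and sample curves are hypothetical; the 0.2 limit is the value stated for Equation 54, and the arrays are assumed to be indexed as [interrogation point][wavelength].

```python
def mean_absolute_difference(ir_current, ir_previous):
    """MAD at each wavelength: mean over points of |IR_current - IR_previous|."""
    n_points = len(ir_current)
    n_wavelengths = len(ir_current[0])
    return [sum(abs(ir_current[i][k] - ir_previous[i][k]) for i in range(n_points)) / n_points
            for k in range(n_wavelengths)]


def passes_mad(mad, limit=0.2):
    """True when MAD stays below the limit at every wavelength supplied."""
    return all(value < limit for value in mad)


# Hypothetical response curves: two interrogation points, two wavelengths.
ir_now = [[1.0, 1.1], [1.0, 0.9]]
ir_last = [[1.0, 1.0], [1.0, 1.0]]
mad = mean_absolute_difference(ir_now, ir_last)   # approximately [0.0, 0.1]
ok = passes_mad(mad)
```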

[0367] Validation can further include validating the results of the fluorescent dye cuvette test
306 in Figure 10, used to standardize fluorescence measurements between individual units and to
correct for variation in collection efficiency as a unit collects data at interrogation points of a
scan pattern. The illustrative calibration methods 110, 116 use one or more metrics to validate
data from the fluorescent dye cuvette test 306 using a coefficient of variation, CV_i(λ), of dye
cuvette intensity, I_Dye, as in Equation 55:

CV_i(λ) = std(I_Dye(λ,i))_i / mean(I_Dye(λ,i))_i    (55)

[0368] The coefficient of variation, CV_i(λ), in Equation 55 between about 470 nm and about
600 nm is generally representative of fluorescence efficiency variations over the scan pattern.
The coefficient of variation at about 674 nm is a measure of how well the collection system
blocks the 337-nm excitation light. As the excitation light passes over the surface of the cuvette,
the incidence and collection angles go in and out of phase, causing modulation around 674 nm.
The coefficient of variation at about 425 nm is a measure of the cleanliness of the cuvette surface
and is affected by the presence of fingerprints, for example. The coefficient of variation below
about 400 nm and above about 700 nm is caused by a combination of the influence of 337-nm
stray excitation light and reduced signal-to-noise ratio due to limited fluorescence from the dye
solution at these wavelengths.

[0369] One metric includes calculating a mean coefficient of variation, CV_i(λ), according to
Equation 55, between about 500 nm and about 550 nm, and comparing the mean coefficient of
variation to an experimentally-determined, fixed value. According to the illustrative
embodiment, validation requires that Equation 56 be satisfied:

mean CV_i(λ) < 0.06 for λ between 500 nm and 550 nm    (56)

[0370] A second metric includes requiring the coefficient of variation at about 674 nm be less
than an experimentally-determined, fixed value. In one embodiment, validation requires that
Equation 57 be satisfied for all interrogation points i:

CV_i(674) < 0.5    (57)

where CV_i(λ) is calculated as in Equation 55.

[0371] Validation can also include validating results of the fluorescent dye cuvette test 306
using both Equations 56 and 57. Here, applying Equation 56 prevents use of data from tests
where the scan axis is significantly shifted relative to the center of the optical axis, as well as
tests where the cuvette is not full or is off-center. Applying Equation 57 prevents use of data
from tests where a faulty UV emission filter is installed or where the UV filter degrades over
time, for example.

[0372] Validation can also include validating the results of the 10% diffuse reflectivity custom
target tests 312, 330 in Figure 10. Here, an off-center target may result in a faulty test due to
interference at regions near the edge of the target, as well as regions near the fluorescent and
phosphorescent plugs that are improperly masked. According to the illustrative embodiment,
validation of the custom target tests 312, 330 requires that the relative difference between the
minimum and maximum intensity, RD(λ), is below a pre-determined value, where RD(λ) is
calculated as in Equation 58:

RD(λ) = 2 · [max(I′(λ,i))_mask - min(I′(λ,i))_mask] / [max(I′(λ,i))_mask + min(I′(λ,i))_mask]    (58)

where (I′(λ,i))_mask refers to all scan positions except those masked to avoid the plugs, as shown
in Figures 19 and 20. In one embodiment, validation requires that Equation 59 be satisfied:

RD(λ) < 1.2 for λ between 370 nm and 700 nm    (59)

where RD(λ) is calculated as in Equation 58.

[0373] The invention can also validate the results of the null target test 304, 328 in Figure 10.
The null target test is used, for example, to account for internal stray light in a given instrument.
According to the illustrative embodiment, a maximum allowable overall amount of stray light is
imposed. For example, in one preferred embodiment, validation of a null target test 304, 328
requires the integrated energy, IE, be below a predetermined value, where IE is calculated from
background-subtracted, power-monitor-corrected null target reflectance intensity measurements,
as in Equation 60:

IE = Σ_λ mean({[null(λ,i) - bkg(λ,i)] / P_m(i)} · mean(P_m(i))_i)_i · Δλ    (60)

where Δλ in the summation above is about 1 nm. In one embodiment, validation requires that
Equation 61 be satisfied:

IE < 4000 counts    (61)

where IE is calculated as in Equation 60.
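
One way to read the integrated energy check is sketched below. It assumes that IE sums, over wavelength, the mean across interrogation points of the background-subtracted null-target signal divided by the point's power monitor reading and rescaled by the mean power monitor reading, with a 1-nm step. The function name, argument layout ([point][wavelength]) and sample numbers are hypothetical.

```python
def integrated_energy(null_counts, bkg_counts, power_monitor, delta_nm=1.0):
    """Sum over wavelength of the point-averaged, power-corrected null-target signal."""
    n_points = len(power_monitor)
    pm_mean = sum(power_monitor) / n_points
    total = 0.0
    for k in range(len(null_counts[0])):
        corrected = [(null_counts[i][k] - bkg_counts[i][k]) / power_monitor[i] * pm_mean
                     for i in range(n_points)]
        total += sum(corrected) / n_points * delta_nm
    return total


# Hypothetical data: two interrogation points, three 1-nm wavelength bins.
null_counts = [[12.0, 12.0, 12.0], [14.0, 14.0, 14.0]]
bkg_counts = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
power_monitor = [100.0, 100.0]
ie = integrated_energy(null_counts, bkg_counts, power_monitor)
null_test_valid = ie < 4000.0   # limit taken from Equation 61
```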

[0374] The invention may also employ validation of the open air target test 310 in Figure 10.
Like the null target test 304, 328, the open air target test is used in accounting for internal stray
light in a given instrument. According to the illustrative embodiment, validation of an open air
target test 310 requires the integrated energy, IE, be below a predetermined value, where IE is
calculated as in Equation 60, except using open air reflectance intensity measurements in place
of null target measurements, null(λ,i). By way of example, in one case validation requires that
the value of integrated energy for the open air test be below 1.2 times the integrated energy from
the null target test, calculated as in Equation 60.

[0375] According to another feature, the invention validates the power monitor corrections
used in the calibration tests in Figure 10. Patient and calibration data that use a power monitor
correction may be erroneous if the illumination source misfires. According to one approach,
validation of a power monitor correction requires that the maximum raw power monitor intensity
reading, P_m(i), be greater than a predetermined minimum value and/or be less than a
predetermined maximum value at each interrogation point i. In the illustrative embodiment,
validation requires that Equation 62 be satisfied:

6000 counts < P_m(i) < 30,000 counts for all i    (62)
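
The range check of Equation 62 is simple enough to express directly. In this sketch the function names and sample readings are hypothetical; the limits are the ones stated above.

```python
def power_monitor_valid(readings, low=6000, high=30000):
    """True when every raw power monitor reading lies strictly between the limits."""
    return all(low < p < high for p in readings)


def misfired_points(readings, low=6000, high=30000):
    """Indices of interrogation points whose reading falls outside the limits."""
    return [i for i, p in enumerate(readings) if not (low < p < high)]


# Hypothetical raw readings at three interrogation points.
good_scan = power_monitor_valid([7000, 12000, 29999])
bad_points = misfired_points([7000, 500, 31000])
```

Reporting the failing indices, rather than only a pass/fail flag, makes it easier to decide whether a scan should be retaken.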

[0376] According to the illustrative embodiment, spectral data pre-processing 114 in Figure 11
includes accounting for the result of the real-time motion tracker 106 in the system 100 of Figure
1 when applying the correction factors in block diagram 340 of Figure 11. As discussed herein,
the system 100 of Figure 1 applies the calibration-based corrections in Figure 11 to spectral data
acquired from a patient scan. These corrections are applied by matching spectral data from each
interrogation point in a patient scan to calibration data from a corresponding interrogation point.
However, a patient scan of the 499 interrogation points shown in the scan pattern 202 of Figure 5
takes approximately 12 seconds. During those 12 seconds, it is possible that the tissue will shift
slightly, due to patient movement. Thus, spectral data obtained during a scan may not
correspond to an initial index location, since the tissue has moved from its original position in
relation to the scan pattern 202. The real-time motion tracker 106 of Figure 1 accounts for this
movement by using data from video images of the tissue to calculate, as a function of scan time,
a translational shift in terms of an x-displacement and a y-displacement. The motion tracker 106
also validates the result by determining whether the calculated x,y translational shift accurately
accounts for movement of the tissue in relation to the scan pattern or some other fixed standard
such as the initial position of component(s) of the data acquisition system (the camera and/or
spectroscope). The motion tracker 106 is discussed in more detail below.

[0377] Illustratively, the spectral data pre-processing 114 in Figure 11 accounts for the result
of the real-time motion tracker 106 by applying a calibration spectra lookup method. The lookup
method includes obtaining the motion-corrected x,y coordinates corresponding to the position of
the center of an interrogation point from which patient spectral data is obtained during a patient
scan. Then the lookup method includes using the x,y coordinates to find the calibration data
obtained from an interrogation point whose center is closest to the x,y coordinates.

[0378] The scan pattern 202 of Figure 5 is a regular hexagonal sampling grid with a pitch
(center-to-center distance) of 1.1 mm and a maximum interrogation point spot size of 1 mm.
This center-to-center geometry indicates a horizontal pitch of 1.1 mm, a vertical pitch of about
0.9527 mm, and a maximum corner distance of the circumscribed regular hexagon to the center
of 0.635 mm. Thus, the illustrative lookup method finds the calibration interrogation point
whose center is closest to the motion-corrected x,y coordinates of a patient scan interrogation
point by finding coordinates of a calibration point that is less than 0.635 mm from x,y.
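
The hexagonal grid geometry and the nearest-point lookup can be illustrated as follows. This is a sketch under stated assumptions: the function name, the tiny three-point grid, and the choice to return None when no calibration center lies within the hexagon corner distance are all hypothetical, and a brute-force search stands in for whatever lookup the instrument actually uses.

```python
import math

PITCH_MM = 1.1                                   # center-to-center distance from the text
ROW_PITCH_MM = PITCH_MM * math.sqrt(3) / 2       # vertical pitch, about 0.953 mm
MAX_CORNER_MM = PITCH_MM / math.sqrt(3)          # hexagon corner to center, about 0.635 mm


def nearest_calibration_point(x, y, centers):
    """Index of the calibration center closest to (x, y), or None if none is within range."""
    best_index, best_dist = None, float("inf")
    for index, (cx, cy) in enumerate(centers):
        dist = math.hypot(x - cx, y - cy)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index if best_dist <= MAX_CORNER_MM else None


# Hypothetical fragment of the grid: two points in one row, one point in the next row.
centers = [(0.0, 0.0), (PITCH_MM, 0.0), (PITCH_MM / 2, ROW_PITCH_MM)]
match = nearest_calibration_point(0.5, 0.1, centers)    # closest to the first center
outside = nearest_calibration_point(5.0, 5.0, centers)  # beyond every hexagon
```

Because every location inside the grid lies within one hexagon, a motion-corrected coordinate that is farther than the corner distance from all centers has moved outside the calibrated scan area.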

[0379] The background spectra, Bkgnd[], in Figure 11, are obtained at nearly the same time
patient spectral data are obtained and no motion correction factor is needed to background-subtract
patient spectral data. For example, at a given interrogation point during a patient scan,
the system 100 of Figure 1 pulses the UV light source on only while obtaining fluorescence data,
then pulses the BB1 light source on only while obtaining the first set of reflectance data, then
pulses the BB2 light source on only while obtaining the second set of reflectance data, then
obtains the background data, Bkgnd[], at the interrogation point with all internal light sources
off. All of this data is considered to be approximately simultaneous and no motion correction
factor is needed for the Bkgnd[] calibration data.

[0380] The real-time motion tracker 106 of Figure 1 uses video data obtained from the tissue
contemporaneously with the spectral data. In addition to motion correction, the system of Figure
1 uses video (image) data to determine image masks for disease probability computation, to
focus the probe 142 through which spectral and/or image data is acquired, and to compute a
brightness and contrast correction and/or image enhancement for use in disease overlay display.

Patient scan procedure 

[0381] Figure 27A is a block diagram 714 showing steps an operator performs before a patient
scan as part of spectral data acquisition 104 in the system 100 of Figure 1, according to an
illustrative embodiment of the invention. The steps in Figure 27A are arranged sequentially with
respect to a time axis 716. As shown, an operator applies a contrast agent to the tissue sample
718, marks the time application is complete 720, focuses the probe 142 through which spectral
and/or image data will be obtained 722, then initiates the spectral scan of the tissue 724 within a
pre-determined window of time.

[0382] According to the illustrative embodiment, the window of time is an optimum range of
time following application of contrast agent to tissue within which an approximately 12 to 15
second scan can be performed to obtain spectral data that are used to classify tissue samples with
a high degree of sensitivity and selectivity. The optimum window should be long enough to
adequately allow for restarts indicated by focusing problems or patient movement, but short
enough so that the data obtained is consistent. Consistency of test data is needed so that tissue
classification results for the test data are accurate and so that the test data may be added to a bank
of reference data used by the tissue classification scheme. In one illustrative embodiment, the
optimum window is expressed in terms of a fixed quantity of time following application of
contrast agent. In another illustrative embodiment, the optimum window is expressed in terms of
a threshold or range of a trigger signal from the tissue, such as a reflectance intensity indicative
of degree of tissue whiteness.

[0383] The contrast agent in Figure 27A is a solution of acetic acid. According to one
exemplary embodiment, the contrast agent is a solution between about 3 volume percent and
about 6 volume percent acetic acid in water. More particularly, in one preferred embodiment,
the contrast agent is an about 5 volume percent solution of acetic acid in water. Other contrast
agents may be used, including, for example, formic acid, propionic acid, butyric acid, Lugol's
iodine, Schiller's iodine, methylene blue, toluidine blue, indigo carmine, indocyanine green,
fluorescein, and combinations of these agents.

[0384] According to the illustrative embodiment, the time required to obtain results from a
patient scan, following pre-patient calibration procedures, is a maximum of about 5 minutes.
Thus, in Figure 27A, the five-minute-or-less procedure includes applying acetic acid to the tissue
sample 726; focusing the probe (142) 728; waiting, if necessary, for the beginning of the
optimum pre-determined window of time for obtaining spectral data 730; obtaining spectral data
at all interrogation points of the tissue sample 732; and processing the data using a tissue
classification scheme to obtain a diagnostic display 734. The display shows, for example, a
reference image of the tissue sample with an overlay indicating regions that are classified as
necrotic tissue, indeterminate regions, healthy tissue (no evidence of disease, NED), and CIN 2/3
tissue, thereby indicating where biopsy may be needed.

[0385] The times indicated in Figure 27A may vary. For example, if the real-time motion 
tracker 106 in the system of Figure 1 indicates too much movement occurred during a scan 732,
the scan 732 may be repeated if there is sufficient time left in the optimum window.

[0386] Figure 27B is a block diagram 738 showing a time line for the spectral scan 732 
indicated in Figure 27A. In the embodiment shown in Figure 27B, a scan of all interrogation 
points of the scan pattern (for example, the scan pattern 202 of Figure 5) takes from about 12 
seconds to about 15 seconds, during which time a sequence of images is obtained for motion
tracking, as performed in step 106 of the system 100 of Figure 1. By the time a scan begins, a
motion-tracking starting image 742 and a target laser image 744 have been obtained 740. The 
target laser image 744 may be used for purposes of off-line focus evaluation, for example. 
During the acquisition of spectral data during the scan, a frame grabber 120 (Figure 1) obtains a 
single image about once every second 746 for use in monitoring and/or correcting for movement
of the tissue from one frame to the next. In Figure 27B, a frame grabber acquires images 748,
750, 752, 754, 756, 758, 760, 762, 764, 766, 768 that are used to track motion that occurs during 
the scan. 

[0387] Image data from a video subsystem is used, for example, in target focusing 728 in 
Figure 27A and in motion tracking 106, 746 in Figure 27B. Image data is also used in detecting 
the proper alignment of a target in a calibration procedure, as well as detecting whether a
disposable is in place prior to contact of the probe with a patient. Additionally, in one 
embodiment, colposcopic video allows a user to monitor the tissue sample throughout the 
procedure.

Video calibration and focusing
[0388] Figure 28 is a block diagram 770 that shows the architecture of an illustrative video 
subsystem used in the system 100 of Figure 1. Figure 28 shows elements of the video subsystem 
in relation to components of the system 100 of Figure 1. The video subsystem 770 acquires
single video images and real-time (streaming) video images. The video subsystem 770 can post-process
acquired image data by applying a mask overlay and/or by adding other graphical
annotations to the acquired image data. Illustratively, image data is acquired in two frame 
buffers during real-time video acquisition so that data acquisition and data processing can be 
alternated between buffers. The camera(s) 772 in the video subsystem 770 of Figure 28 include
a camera located in or near the probe head 192 shown in Figure 4, and optionally include a
colposcope camera external to the probe 142 for visual monitoring of the tissue sample during 
testing. In one illustrative embodiment, only the probe head camera is used. Figure 28 shows a 
hardware interface 774 between the cameras 772 and the rest of the video subsystem 770. The 
frame grabber 120 shown in Figure 1 acquires video data for processing in other components of
the tissue characterization system 100. In one embodiment, the frame grabber 120 uses a card
for video data digitization (video capture) and a card for broadband illumination (for example, 
flash lamps) control. For example, one embodiment uses a Matrox Meteor 2 card for digitization 
and an Imagenation PXC-200F card for illumination control, as shown in block 776 of Figure 28.

[0389] Real-time (streaming) video images are used for focusing the probe optics 778 as well
as for visual colposcopic monitoring of the patient 780. Single video images provide data for
calibration 782, motion tracking 784, image mask computation (used in tissue classification) 
786, and, optionally, detection of the presence of a disposable 788. In some illustrative 
embodiments, a single reference video image of the tissue sample is used to compute the image 
masks 108 in the system 100 of Figure 1. This reference image is also used in determining a
brightness and contrast correction and/or other visual enhancement 126, and is used in the
disease overlay display 138 in Figure 1.

[0390] The illustrative video subsystem 770 acquires video data 790 from a single video image 
within about 0.5 seconds. The video subsystem 770 acquires single images in 24-bit RGB 
format and is able to convert them to grayscale images. For example, image mask computation 
108 in Figure 1 converts the RGB color triplet data into a single luminance value, Y, (grayscale
intensity value) at each pixel, where Y is given by Equation 63: 

Y = 0.299R + 0.587G + 0.114B    (63)

where the grayscale intensity component, Y, is expressed in terms of red (R), green (G), and blue
(B) intensities; and where R, G, and B range from 0 to 255 for a 24-bit RGB image. 
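
Equation 63 can be sketched in a few lines. This is a minimal illustration, assuming the image is held as nested lists of 8-bit (R, G, B) tuples; the function names and the rounding to whole gray levels are assumptions made for the example and are not part of the described system.

```python
def luminance(r, g, b):
    """Grayscale intensity Y from 8-bit R, G, B values (Equation 63)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_grayscale(rgb_image):
    """Convert nested lists of (R, G, B) tuples to rounded luminance values.

    Rounding to the nearest integer is an assumption for display purposes.
    """
    return [[round(luminance(*pixel)) for pixel in row] for row in rgb_image]
```

Because the three weights sum to 1, a pure white pixel (255, 255, 255) maps to a luminance of 255 and a black pixel maps to 0.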
[0391] Laser target focusing 728 is part of the scan procedure in Figure 27A. An operator uses 
a targeting laser in conjunction with real-time video to quickly align and focus the probe 142 
prior to starting a patient scan. In the illustrative embodiment, an operator performs a laser 
"spot" focusing procedure in step 728 of Figure 27A where the operator adjusts the probe 142 to 
align laser spots projected onto the tissue sample. The user adjusts the probe while looking at a 
viewfinder with an overlay indicating the proper position of the laser spots. In one alternative 
embodiment, an operator instead performs a thin-line laser focusing method, where the operator 
adjusts the probe until the laser lines become sufficiently thin. The spot focus method allows for 
faster, more accurate focusing than a line-width-based focusing procedure, since thin laser lines 
can be difficult to detect on tissue, particularly dark tissue or tissue obscured by blood. Quick 
focusing is needed in order to obtain a scan within the optimal time window following
application of contrast agent to the tissue; thus, a spot-based laser focusing method is preferable 
to a thin line method, although a thin line focus method may be used in alternative embodiments. 
[0392] A target focus validation procedure 122 is part of the tissue characterization system 100
of Figure 1, and determines whether the optical system of the instrument 102 is in focus prior to 
a patient scan. If the system is not in proper focus, the acquired fluorescence and reflectance 
spectra may be erroneous. Achieving proper focus is important to the integrity of the image 
masking 108, real-time tracking 106, and overall tissue classification 132 components of the 
system 100 of Figure 1. 

[0393] The focus system includes one or more target laser(s) that project laser light onto the 
patient sample prior to a scan. In one embodiment, the targeting laser(s) project laser light from 
the probe head 192 toward the sample at a slight angle with respect to the optical axis of the 
probe 142 so that the laser light that strikes the sample moves within the image frame when the 
probe is moved with respect to the focal plane. For example, in one illustrative embodiment, 
four laser spots are directed onto a target such that when the probe 142 moves toward the target 
during focusing, the spots move closer together, toward the center of the image. Similarly, when 
the probe 142 moves away from the target, the spots move further apart within the image frame, 
toward the corners of the image. 

[0394] Figure 29A is a single video image 794 of a target 796 of 10% diffuse reflectivity upon
which a target laser projects a focusing pattern of four laser spots 798, 800, 802, 804. During 
laser target focusing 728 (Figure 27A), an operator views four focus rings that are displayed at
predetermined locations, superimposed on the target focusing image. Figure 29B depicts the
focusing image 794 on the target 796 in Figure 29A with superimposed focus rings 806, 808,
810, 812. The operator visually examines the relative positions of the laser spots 798, 800, 802,
804 in relation to the corresponding focus rings 806, 808, 810, 812 while moving the probe head
192 along the optical axis toward or away from the target/tissue sample. When the laser spots lie
within the focus rings as shown in Figure 29B, the system is within its required focus range. The 
best focus is achieved by aligning the centers of all the laser spots with the corresponding centers 
of the focus rings. Alternatively, spot patterns of one, two, three, five, or more laser spots may 
be used for focus alignment. 

[0395] It is generally more difficult to align laser spots that strike a non-flat tissue sample
target than to align the spots on a flat, uniform target as shown in Figure 29B. In some instances,
a laser spot projected onto tissue is unclear, indistinct, or invisible. Visual evaluation of focus
may be subjective and qualitative. Thus, a target focus validation procedure is useful to ensure
that proper focus of a tissue target is achieved. Proper focus allows the comparison of both image
data and spectral data from different instrument units and different operators.

[0396] In one illustrative embodiment, the system 100 of Figure 1 performs an automatic 
target focus validation procedure using a single focus image. The focus image is a 24-bit RGB 
color image that is obtained before acquisition of spectral data in a patient scan. The focus 
image is obtained with the targeting laser turned on and the broadband lights (white lights)
turned off. Automatic target focus validation includes detecting the locations of the centers of
visible laser spots and measuring their positions relative to stored, calibrated positions 
("nominal" center positions). Then, the validation procedure applies a decision rule based on the 
number of visible laser spots and their positions and decides whether the system is in focus and a 
spectral scan can be started. 

[0397] Figure 30 is a block diagram 816 of a target focus validation procedure according to an
illustrative embodiment of the invention. The steps include obtaining a 24-bit RGB focus image 
818, performing image enhancement 820 to highlight the coloration of the laser spots, 
performing morphological image processing (dilation) to fill holes and gaps within the spots 822, 
defining a region of interest (ROI) of the image 824, and computing a mean and standard
deviation 826 of the luminance values (brightness) of pixels within the region of interest. Next,
the focus validation procedure iteratively and dynamically thresholds 828 the enhanced focus 
image using the computed mean and standard deviation to extract the laser spots. Between 
thresholding iterations, morphological processing 830 disconnects differentiated image objects
and removes small image objects from the thresholded binary image, while a region analysis
procedure 832 identifies and removes image objects located outside the bounds of the target laser
spot pathways 838 and objects whose size and/or shape do not correspond to a target laser spot.
After all thresholding iterations, the found "spots" are either verified as true target laser spots or
are removed from the image 834, based on size, shape, and/or location. Next, in step 842, the
focus validation procedure computes how far the centers of the found spots are from the nominal 
focus centers and converts the difference from pixels to millimeters in step 844. The validation 
procedure then applies a decision rule based on the number of found spots and their positions 
and decides whether the system is in focus such that a spectral scan of the patient can begin. 

[0398] The focus validation procedure of Figure 30 begins with obtaining the 24-bit RGB
focus image and splitting it into R, G, and B channels. Each channel has a value in the range of
0 to 255. Figure 31 depicts the RGB focus image 794 from Figure 29A with certain illustrative
geometry superimposed. Figure 31 shows the four nominal spot focus centers 850, 852, 854,
856 as red dots, one of which is the red dot labeled "N" in quadrant 1. The nominal spot focus
centers represent the ideal location of centers of the projected laser spots, achieved when the
probe optics are in optimum focus. The nominal spot focus centers 850, 852, 854, 856
correspond to the centers of the rings 806, 808, 810, 812 in Figure 29B. An (x,y) position is
determined for each nominal focus center. A nominal image focus center (857), O, is defined by
the intersection of the two red diagonal lines 858, 860 in Figure 31. The red diagonal lines 858,
860 connect the two pairs of nominal spot focus centers 852, 854 in quadrants 2 and 3 and 850,
856 in quadrants 1 and 4, respectively. Also, the slopes of the two lines 858, 860 are computed
856 in quadrants 1 and 4, respectively. Also, the slopes of the two lines 858, 860 are computed 
for later use. 

[0399] Step 820 in the procedure of Figure 30 is image enhancement to highlight the 
coloration of the laser spots in contrast to the surrounding tissue. In one embodiment, the R 
value of saturated spots is "red clipped" such that if R is greater than 180 at any pixel, the R
value is reduced by 50. Then, a measure of greenness, G_E, of each pixel is computed as in
Equation 64:

G_E = G - R - 15    (64)

where G is the green value of a pixel, R is the red value of the pixel, and 15 is a correction factor
to remove low intensity noise, experimentally determined here to be 15 gray levels.
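
The red clipping and greenness computation of Equation 64 can be sketched per pixel as follows. This is a minimal illustration; the function and parameter names are hypothetical, and the clamping helper for display is an assumption, since the text does not state how negative greenness values are handled.

```python
def greenness(r, g, b, clip_above=180, clip_by=50, noise_floor=15):
    """Enhanced greenness of one pixel (Equation 64) after red clipping.

    The blue value is accepted for a uniform pixel interface but is not
    used by Equation 64.
    """
    if r > clip_above:       # "red clip" saturated spots
        r -= clip_by
    return g - r - noise_floor


def clamp_for_display(value):
    """Assumption: clamp to 0..255 when shown as a grayscale level."""
    return max(0, min(255, value))
```

For example, a pixel with R = 200 and G = 255 is first clipped to R = 150, giving a greenness of 255 - 150 - 15 = 90.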

[0400] Figure 32A represents the green channel of an RGB image 864 of a cervical tissue 
sample, used in an exemplary target focus validation procedure. In this image, only two top 
focus laser spots 868, 870 are clear. The lower right spot 872 is blurred/diffused while the lower
left spot 874 is obscured. The green-channel luminance (brightness), G_E, of the green-enhanced
RGB image 864 of Figure 32A may be computed using Equation 64 and may be displayed, for 
example, as grayscale luminance values between 0 and 255 at each pixel. 
[0401] In step 822 of Figure 30, the focus validation procedure performs morphological
dilation using a 3x3 square structuring element to fill holes and gaps within the found spots.
Then in step 824, the procedure uses a pre-defined, circular region of interest (ROI) for
computing a mean, M, and a standard deviation, STD, 826 of the greenness value, G_E, of the
pixels within the ROI, which are used in iterative dynamic thresholding 828. According to the
illustrative embodiment, the ROI is a substantially circular region with a 460-pixel diameter
whose center coincides with the nominal image focus center, O.
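
A dilation with a 3x3 square structuring element and the ROI statistics can be sketched in pure Python as shown here. This is an illustrative sketch only: the image is assumed to be a nested list of greenness values, the neighborhood is truncated at the image border, and a population standard deviation is used, none of which is specified in the text.

```python
from statistics import mean, pstdev


def dilate_3x3(image):
    """Grayscale dilation: each pixel becomes the maximum of its 3x3 neighborhood.

    Assumption: the neighborhood is truncated at the image border.
    """
    h, w = len(image), len(image[0])
    return [
        [
            max(
                image[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def roi_mean_std(image, center, diameter):
    """Mean and population standard deviation inside a circular ROI.

    center is (x, y) in pixels; the text uses a 460-pixel diameter about O.
    """
    cx, cy = center
    radius_sq = (diameter / 2.0) ** 2
    values = [
        value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq
    ]
    return mean(values), pstdev(values)
```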

[0402] Before iterative dynamic thresholding begins, G_E is set equal to zero at a 50-pixel
diameter border about the ROI. Then, iterative dynamic thresholding 828 begins by setting an 
iteration variable, p, to zero, then computing a threshold value, Th, as follows: 

Th = M + p \cdot STD    (65)

where M and STD are defined from the ROI. Since p=0 in the first iteration, the threshold, Th, is
a "mean" greenness value over the entire ROI in the first iteration. In this embodiment, image 
thresholding is a subclass of image segmentation that divides an image into two segments. The 
result is a binary image made up of pixels, each pixel having a value of either 0 (off) or 1 (on). 
In step 828 of the focus validation procedure of Figure 30, the enhanced greenness value of a
pixel corresponding to point (x,y), within the ROI, G_E(x,y), is compared to the threshold value,
Th. The threshold is applied as in Equation 66:

IF G_E(x,y) > Th, THEN the binary pixel value at (x,y), B_T = 1, ELSE B_T = 0.    (66)
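
Equations 65 and 66 can be sketched together. The sketch below is a minimal illustration operating on a flat list of greenness values from the ROI; the function names are hypothetical, and the use of a population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def threshold_value(roi_values, p):
    """Th = M + p * STD over the ROI greenness values (Equation 65)."""
    return mean(roi_values) + p * pstdev(roi_values)


def binarize(values, th):
    """Binary value per pixel: 1 if greenness exceeds Th, else 0 (Equation 66)."""
    return [1 if v > th else 0 for v in values]
```

With p = 0 the threshold equals the ROI mean, so the first iteration keeps every pixel that is greener than average; larger values of p keep only progressively brighter pixels.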
[0403] Iterative dynamic thresholding 828 proceeds by performing morphological opening 830 
to separate nearby distinguishable image objects and to remove small objects of the newly
thresholded binary image. According to the illustrative embodiment, the morphological opening
830 includes performing an erosion, followed by a dilation, each using a 3x3 square structuring
element. The procedure then determines the centroid of each of the thresholded objects and
removes each object whose center is outside the diagonal bands bounded by two lines that are 40
pixels above and below the diagonal lines 858, 860 in Figure 31. These diagonal bands include
the region between lines 876, 878 and the region between lines 880, 882 in Figure 31,
determined in step 838 of Figure 30. An image object whose center lies outside these bands does
not correspond to a target focus spot, since the centers of the focus laser spots should appear 
within these bands at any position of the probe along the optical axis. The spots move closer
together, within the bands, as the probe moves closer to the tissue sample, and the spots move
farther apart, within the bands, as the probe moves away from the tissue sample. 
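
The diagonal-band test can be sketched as a vertical-offset check against each diagonal line. This is a minimal illustration, assuming each diagonal is described by a slope and intercept in image coordinates and that a centroid exactly on a band edge counts as inside; the names are hypothetical.

```python
def in_band(x, y, slope, intercept, half_width=40):
    """True if (x, y) lies within half_width pixels above or below the line."""
    return abs(y - (slope * x + intercept)) <= half_width


def in_any_band(x, y, lines, half_width=40):
    """True if the centroid falls inside the band around any diagonal line.

    lines is an iterable of (slope, intercept) pairs, one per diagonal.
    """
    return any(in_band(x, y, m, c, half_width) for m, c in lines)
```

An object whose centroid fails `in_any_band` would be removed, consistent with the rule that valid spot centers stay on the diagonals as the probe moves along the optical axis.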
[0404] Next, step 832 of the thresholding iteration 828 computes an area (A), eccentricity (E), 
and equivalent diameter (ED) of the found image objects, and removes an object whose size 
and/or shape - described here by A, E, and ED - does not correspond to that of a focus laser 
spot. E and ED are defined as follows: 

E = (1 - b^2/a^2)^{0.5}    (67)
ED = 2(A/\pi)^{0.5}    (68)
where a is the major axis length and b is the minor axis length in units of pixels. For example,
step 832 applies Equation 69 as follows: 

IF A > 5000 OR IF E > 0.99 OR IF ED > 110,
THEN remove object (set B_T = 0 for all pixels in object).    (69)
Other criteria may be applied. For example, Equation 70 may be applied in place of Equation 
69: 

IF A > 2500 OR IF E > 0.99 OR IF ED > 80,
THEN remove object (set B_T = 0 for all pixels in object).    (70)
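
The shape measures and the removal rule of Equation 69 can be sketched as follows. This is an illustrative sketch with hypothetical function names; it takes the ratio of the shorter axis to the longer axis so that the square root in the eccentricity stays real, and the default limits are the values quoted for Equation 69.

```python
import math


def eccentricity(major_axis, minor_axis):
    """Eccentricity of the object's best-fit ellipse (Equation 67)."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


def equivalent_diameter(area):
    """Diameter of the circle with the same area as the object (Equation 68)."""
    return 2.0 * math.sqrt(area / math.pi)


def remove_during_iteration(area, ecc, eq_diam,
                            max_area=5000, max_ecc=0.99, max_diam=110):
    """Equation 69: drop objects too large or too elongated to be a laser spot."""
    return area > max_area or ecc > max_ecc or eq_diam > max_diam
```

Passing `max_area=2500` and `max_diam=80` gives the alternative criteria of Equation 70.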
[0405] Next, the iteration variable, p, is increased by a fixed value, for example, by 0.8, and a 
new threshold is calculated using Equation 65. The iteration proceeds by applying the new 
threshold, performing a morphological opening, computing centroids of the newly thresholded 
regions, removing regions whose center position, size, and/or shape do not correspond to those 
of a target focus spot, and stepping up the value of the iteration variable p. Iterative dynamic 
thresholding proceeds until a pre-determined condition is satisfied. For example, the 
thresholding ends when the following condition is satisfied: 

IF p > 6 OR IF the number of qualified spots (image objects) < 4, THEN STOP. (71) 
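
The control flow of the iteration can be sketched as a loop over p. This is a sketch under stated assumptions: `count_qualified` is a hypothetical callback standing in for thresholding, morphological opening, and region analysis at a given threshold, and the stop test follows Equation 71 as written, applied after p has been stepped.

```python
def run_thresholding(m, std, count_qualified, step=0.8, p_max=6.0, min_spots=4):
    """Iterate Th = M + p * STD, stepping p until Equation 71 says stop.

    Returns a list of (p, Th, qualified_count) tuples, one per iteration.
    count_qualified(th) is a hypothetical callback returning the number of
    image objects that survive opening and region analysis at threshold th.
    """
    history = []
    k = 0
    while True:
        p = k * step                      # p = 0 on the first pass
        th = m + p * std                  # Equation 65
        n = count_qualified(th)
        history.append((p, th, n))
        k += 1                            # step p up by the fixed increment
        if k * step > p_max or n < min_spots:   # Equation 71
            return history
```

With a step of 0.8 and a limit of 6, at most eight thresholds are tried (p = 0 through p = 5.6) before the stepped value 6.4 exceeds the limit.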

[0406] Step 834 of the focus validation procedure eliminates any image object remaining after 
dynamic thresholding that does not meet certain laser spot size and shape criteria. For example, 
according to the illustrative embodiment, step 834 applies the condition in Equation 72 for each 
remaining image object: 

IF A < 80 OR IF E > 0.85 OR IF ED < 10, THEN remove object.    (72)
[0407] In an alternative embodiment, one or more additional criteria based on the position of 
each image object (found spot) are applied to eliminate objects that are still within the focus
bands of Figure 31, but are too far from the nominal centers 850, 852, 854, 856 to be valid focus
spots. 

[0408] Figure 32B shows an image 898 of the cervical tissue sample of Figure 32A following 
step 834, wherein the top two image objects were verified as target laser spots, while the bottom
objects were eliminated.

[0409] Step 842 of the focus validation procedure assigns each of the found spots to its 
respective quadrant and computes the centroid of each found spot. Figure 31 shows the found 
spots as blue dots 900, 902, 904, 906. Then for each found spot, step 842 computes the distance 
between the center of the spot and the nominal image focus center 857, O. For the focus spot
center 900 labeled "F" in Figure 31, this distance is L_OF, the length of the blue line 910 from
point O to point F. The distance between the nominal focus center, N, 850 corresponding to the
quadrant containing the found spot, and the nominal image focus center 857, O, is L_ON, the
length of the red line 912 from point O to point N. Step 842 of the focus validation procedure
then determines a focus value for verified focus spot 900 equal to the difference between the
lengths L_OF and L_ON. The focus value of each of the verified focus spots is computed in this
manner, and the focus values are converted from pixels to millimeters along the focus axis
(z-axis) in step 844 of Figure 30 using an empirically determined conversion ratio, for example,
0.34 mm per pixel.
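
The focus value computation can be sketched directly from the two distances. This is a minimal illustration with hypothetical names; the sign convention (found-spot distance minus nominal distance) is an assumption, since the text specifies only "the difference between the lengths".

```python
import math

MM_PER_PIXEL = 0.34  # example conversion ratio quoted in the text


def focus_value_mm(found, nominal, image_center, mm_per_pixel=MM_PER_PIXEL):
    """Focus value of one spot: (|OF| - |ON|) converted from pixels to mm.

    found, nominal, and image_center are (x, y) pixel coordinates of the
    found spot centroid F, its nominal center N, and the image focus center O.
    """
    l_of = math.dist(image_center, found)
    l_on = math.dist(image_center, nominal)
    return (l_of - l_on) * mm_per_pixel
```

For instance, a spot found 60 pixels from O whose nominal center is 50 pixels from O has a focus value of 10 pixels, or about 3.4 mm.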

[0410] Next, the focus validation procedure of Figure 30 applies a decision rule in step 846 
based on the number of found spots and their positions. The decision rule is a quantitative
means of deciding whether the system is in focus and a spectral scan of the tissue can begin. 
According to the illustrative embodiment, step 846 applies a decision rule given by Equations 
73, 74, and 75: 

IF 3 or more spots are found, THEN    (73)
    IF the focus value determined in step 842 is < 6 mm for any 3 spots OR
    IF the focus value is < 4 mm for any 2 spots,
    THEN "Pass", ELSE "Fail" (require refocus).

IF only 2 spots are found, THEN    (74)
    IF the focus value of any spot is > 4 mm,
    THEN "Fail" (require refocus), ELSE "Pass".

IF ≤ 1 spot is found, THEN    (75)
    "Fail" (require refocus).

Other decision rules may be used alternatively. 
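
The illustrative decision rule can be sketched as a small function. The sketch makes three labeled assumptions: focus values are compared by magnitude, "for any 3 spots" is read as "for at least three of the found spots", and zero or one found spot is treated as a failure, which is how the sketch reads the final clause of the rule.

```python
def focus_decision(focus_values_mm):
    """Return "Pass" or "Fail" for the list of per-spot focus values in mm."""
    magnitudes = [abs(v) for v in focus_values_mm]
    found = len(magnitudes)
    if found >= 3:
        within_6 = sum(1 for v in magnitudes if v < 6)
        within_4 = sum(1 for v in magnitudes if v < 4)
        return "Pass" if within_6 >= 3 or within_4 >= 2 else "Fail"
    if found == 2:
        return "Fail" if any(v > 4 for v in magnitudes) else "Pass"
    return "Fail"  # zero or one spot found: require refocus
```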

[0411] Figures 33 and 34 show the application of the focus validation procedure of Figure 30
using a rubber cervix model placed so that the two upper laser spots are within the os region. 
For this example, the distance between the edge of the probe head 192 and the target (or target 
tissue) is approximately 100 mm at optimum focus, and the distance light travels between the 
target (or target tissue) and the first optic within the probe 142 is approximately 130 mm at 
optimum focus. 

[0412] Figure 33 is a 24-bit RGB target laser focus image 942 of a rubber cervix model 944 
onto which four laser spots 946, 948, 950, 952 are projected. The cervix model 944 is off-center 
in the image 942 such that the two upper laser spots 946, 948 lie within the os region. Figure 34 
shows a graph 954 depicting, as a function of probe position relative to the target tissue 956, the
mean of a focus value 958 (in pixels) of each of the four laser spots 946, 948, 950, 952 projected
onto the rubber cervix model 944. The curve fit 960 of the data indicates that the relationship
between measured focus, f, 958 and probe location, z_p, 956 (in mm) is substantially linear.
However, the curve is shifted down and is not centered at (0,0). This indicates a focus error 
introduced by the manual alignment used to obtain the z=0 focus position. Such an error may 
prompt a "Fail" determination in step 846 of the focus validation procedure of Figure 30, 
depending on the chosen decision rule. Figure 34 indicates the difficulty in making a visual 
focus judgment to balance the focus of the four spots, particularly where the target surface 
(tissue sample) is not flat and perpendicular to the optical axis (z-axis) of the probe system. 
[0413] The focus validation procedure illustrated in Figure 30 provides an automatic, 
quantitative check of the quality of focus. Additionally, in the illustrative embodiment, the focus 
validation procedure predicts the position of optimum focus and/or automatically focuses the 
optical system accordingly by, for example, triggering a galvanometer subsystem to move the 
probe to the predicted position of optimum focus. 

[0414] The focus validation procedure in Figure 30 produces a final decision in step 846 of 
"Pass" or "Fail" for a given focus image, based on the decision rule given by Equations 73-75. 
This indicates whether the focus achieved for this tissue sample is satisfactory and whether a 
spectral data scan may proceed as shown in step 732 of Figures 27A and 27B.

Determining optimal data acquisition window
[0415] After application of contrast agent 726 and target focusing 728, step 730 of Figure 27A 
indicates that the operator waits for the beginning of the optimum window for obtaining spectral 
data unless the elapsed time already exceeds the start of the window. The optimum window
indicates the best time period for obtaining spectral data, following application of contrast agent
to the tissue, considering the general time constraints of the entire scan process in a given
embodiment. For example, according to the illustrative embodiment, it takes from about 12 to
about 15 seconds to perform a spectral scan of 499 interrogation points of a tissue sample. An
optimum window is determined such that data may be obtained over a span of time within this
window from a sufficient number of tissue regions to provide an adequately detailed indication
of disease state with sufficient sensitivity and selectivity. The optimum window preferably also
allows the test data to be used, in turn, as reference data in a subsequently developed tissue
classification module. According to another feature, the optimum window is wide enough to
allow for restarts necessitated, for example, by focusing problems or patient movement. Data
obtained within the optimum window can be added to a bank of reference data used by a tissue
classification scheme, such as component 132 of the system 100 of Figure 1. Thus, the optimum 
window is preferably narrow enough so that data from a given region is sufficiently consistent 
regardless of when, within the optimum window, it is obtained. 

[0416] According to the illustrative embodiment, the optimal window for obtaining spectral 
data in step 104 of Figure 1 is a period of time from about 30 seconds following application of
the contrast agent to about 130 seconds following application of the contrast agent. The time it 
takes an operator to apply contrast agent to the tissue sample may vary, but is preferably between 
about 5 seconds and about 10 seconds. The operator creates a time stamp in the illustrative scan 
procedure of Figure 27A after completing application of the contrast agent, and then waits 30 
seconds before a scan may begin, where the optimum window is between about 30 seconds and
about 130 seconds following application of contrast agent. If the scan takes from about 12 
seconds to about 15 seconds to complete (where no retake is required), the start of the scan 
procedure must begin soon enough to allow all the data to be obtained within the optimum 
window. In other words, in this embodiment, assuming a worst case of 15 seconds to complete the
scan, the scan must begin no later than 115 seconds following the time stamp (115 seconds
after application of contrast agent) so that the scan is completed by 130 seconds following
application of contrast agent. Other optimum windows may be used. In one embodiment, the
optimum window is between about 30 seconds and about 110 seconds following application of
contrast agent. One alternative embodiment has an optimal window with a "start" time from
about 10 to about 60 seconds following application of acetic acid, and an "end" time from about
110 to about 180 seconds following application of acetic acid. Other optimum windows may be
used. 
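
The timing arithmetic for the 30-to-130-second window can be sketched as follows. This is a minimal illustration with hypothetical names; times are in seconds from the time stamp, and the defaults are the illustrative values quoted above.

```python
def latest_scan_start(window_end_s=130, scan_duration_s=15):
    """Latest start time that still lets the scan finish inside the window."""
    return window_end_s - scan_duration_s


def can_start_scan(elapsed_s, window_start_s=30, window_end_s=130,
                   scan_duration_s=15):
    """True if a scan started now would begin and end inside the window."""
    return window_start_s <= elapsed_s <= window_end_s - scan_duration_s
```

The same check applies to a repeated scan after a restart: a retake is possible only while `can_start_scan` still holds for the current elapsed time.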

[0417] In one illustrative embodiment, the tissue characterization system 100 of Figure 1 
includes identifying an optimal window for a given application, and/or subsequently using 
spectral data obtained within the pre-determined window in a tissue classification module, such 
as step 132 of Figure 1. According to one feature, optimal windows are determined by obtaining 
optical signals from reference tissue samples with known states of health at various times 
following application of a contrast agent. 

[0418] Determining an optimal window illustratively includes the steps of obtaining a first set 
of optical signals from tissue samples having a known disease state, such as CIN 2/3 (grades 2 
and/or 3 cervical intraepithelial neoplasia); obtaining a second set of optical signals from tissue 
samples having a different state of health, such as non-CIN 2/3; and categorizing each optical 
signal into "bins" according to the time it was obtained in relation to the time of application of 
contrast agent. The optical signal may include, for example, a reflectance spectrum, a 
fluorescence spectrum, a video image intensity signal, or any combination of these. 
[0419] A measure of the difference between the optical signals associated with the two types 
of tissue is then obtained, for example, by determining a mean signal as a function of wavelength 
for each of the two types of tissue samples for each time bin, and using a discrimination function 
to determine a weighted measure of difference between the two mean optical signals obtained 
within a given time bin. This provides a measure of the difference between the mean optical 
signals of the two categories of tissue samples - diseased and healthy - weighted by the variance 
between optical signals of samples within each of the two categories.
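
The text does not name the discrimination function. As one possible illustration only, a Fisher-style ratio (squared difference of class means divided by the sum of class variances) matches the description of a mean difference weighted by within-class variance. The sketch below is an assumption, not the method of the described system, and all names are hypothetical.

```python
from statistics import mean, pvariance


def weighted_difference(diseased, healthy):
    """Fisher-style ratio at one wavelength (assumed form, not from the text).

    Assumes the two groups do not both have zero variance.
    """
    gap = mean(diseased) - mean(healthy)
    spread = pvariance(diseased) + pvariance(healthy)
    return gap * gap / spread


def bin_score(diseased_spectra, healthy_spectra):
    """Sum the per-wavelength ratios for the spectra falling in one time bin.

    Each argument is a list of spectra; each spectrum is a list of
    intensities sampled at the same wavelengths.
    """
    per_wavelength = zip(zip(*diseased_spectra), zip(*healthy_spectra))
    return sum(weighted_difference(d, h) for d, h in per_wavelength)
```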

[0420] According to the illustrative embodiment, the invention further includes developing a 
classification model for each time bin for the purpose of determining an optimal window for 
obtaining spectral data in step 104 of Figure 1. After determining a measure of difference
between the tissue types in each bin, an optimal window of time for differentiating between 
tissue types is determined by identifying at least one bin in which the measure of difference 
between the two tissue types is substantially maximized. For example, an optimal window of 
time may be chosen to include every time bin in which a respective classification model provides 
an accuracy of 70% or greater. Here, the optimal window describes a period of time following 
application of a contrast agent in which an optical signal can be obtained for purposes of
classifying the state of health of the tissue sample with an accuracy of at least 70%. Models
distinguishing between three or more categories of tissue may also be used in determining an 
optimal window for obtaining spectral data. As discussed below, other factors may also be 
considered in determining the optimal window. 
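
Selecting the window from per-bin classification accuracy can be sketched as follows. This is a minimal illustration with hypothetical names; each bin is assumed to be given as a ((start_s, end_s), accuracy) pair, and reporting a single span from the earliest to the latest qualifying bin is an assumption that is only meaningful when the qualifying bins are contiguous.

```python
def qualifying_bins(bin_accuracy, min_accuracy=0.70):
    """Return the (start_s, end_s) bins whose model accuracy meets the cutoff."""
    return [span for span, accuracy in bin_accuracy if accuracy >= min_accuracy]


def window_span(bins):
    """Earliest start and latest end of the given bins, or None if empty."""
    if not bins:
        return None
    return min(start for start, _ in bins), max(end for _, end in bins)
```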

[0421] An analogous embodiment includes determining an optimal threshold or range of a 
measure of change of an optical signal to use in obtaining (or triggering the acquisition of) the 
same or a different signal for predicting the state of health of the sample. Instead of determining 
a specific, fixed window of time, this embodiment includes determining an optimal threshold of 
change in a signal, such as a video image whiteness intensity signal, after which an optical 
signal, such as a diffuse reflectance spectrum and/or a fluorescence spectrum, can be obtained to 
accurately characterize the state of health or other characteristic of the sample. This illustrative 
embodiment includes monitoring reflectance and/or fluorescence at a single or multiple 
wavelength(s), and upon reaching a threshold change from the initial condition, obtaining a full
reflectance and/or fluorescence spectrum for use in diagnosing the region of tissue. This method 
allows for reduced data retrieval and monitoring, since it involves continuous tracking of a 
single, partial-spectrum or discrete-wavelength "trigger" signal (instead of multiple, full- 
spectrum scans), followed by the acquisition of spectral data in a spectral scan for use in tissue 
characterization, for example, the tissue classification module 132 of Figure 1. Alternatively, the 
trigger may include more than one discrete-wavelength or partial-spectrum signal. The measure 
of change used to trigger obtaining one or more optical signals for tissue classification may be a 
weighted measure, and/or it may be a combination of measures of change of more than one 
signal. 
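
A minimal sketch of such a threshold trigger follows, assuming the trigger signal is sampled at discrete times and that the "change from the initial condition" is measured as a relative change with respect to the first sample. The function name and the sample values are hypothetical.

```python
def first_trigger_index(trigger_signal, threshold):
    """Index of the first sample whose relative change from the initial
    value reaches the threshold, or None if the threshold is never reached."""
    initial = trigger_signal[0]
    if initial == 0:
        raise ValueError("initial signal must be non-zero for a relative change")
    for i, value in enumerate(trigger_signal):
        if abs(value - initial) / abs(initial) >= threshold:
            return i
    return None


# Hypothetical single-wavelength reflectance samples (arbitrary units).
samples = [1.00, 1.05, 1.12, 1.21, 1.30]
print(first_trigger_index(samples, threshold=0.20))  # 3
```

In a system of this kind, the returned index would mark the moment at which the full spectral scan is started.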

[0422] In a further illustrative embodiment, instead of determining an optimal threshold or 
range of a measure of change of an optical signal, an optimal threshold or range of a measure of 
the rate of change of an optical signal is determined. For example, the rate of change of 
reflectance and/or fluorescence is monitored at a single or multiple wavelength(s), and upon 
reaching a threshold rate of change, a spectral scan is performed to provide spectral data for use 
in diagnosing the region of tissue. The measure of rate of change used to trigger obtaining one 
or more optical signals for tissue classification may be a weighted measure, and/or it may be a 
combination of measures of change of more than one signal. For example, the measured rate of 
change may be weighted by an initial signal intensity. 
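
The rate-of-change variant can be sketched in the same way. In this sketch the rate is a finite difference between consecutive samples, and the weighting by initial signal intensity is assumed to be a simple division by the first sample; both choices, and all names and values, are assumptions for illustration.

```python
def first_rate_trigger(times, signal, rate_threshold):
    """Return the first sample time at which the magnitude of the
    intensity-weighted rate of change reaches rate_threshold, else None."""
    initial = signal[0]
    if initial == 0:
        raise ValueError("initial signal must be non-zero for weighting")
    for i in range(1, len(signal)):
        rate = (signal[i] - signal[i - 1]) / (times[i] - times[i - 1])
        if abs(rate / initial) >= rate_threshold:
            return times[i]
    return None


# Hypothetical samples: time in seconds, signal in arbitrary units.
print(first_rate_trigger([0, 10, 20, 30], [2.0, 2.1, 2.6, 2.7], 0.02))  # 20
```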

[0423] According to the illustrative embodiment, the optimum time window includes a time 
window in which spectra from cervical tissue may be obtained such that sites indicative of
grades 2 and 3 cervical intraepithelial neoplasia (CIN 2/3) can be separated from non-CIN 2/3
sites. Non-CIN 2/3 sites include sites with grade 1 cervical intraepithelial neoplasia (CIN 1), as 
well as NED sites, normal columnar and normal squamous epithelia, and mature and immature 
metaplasia. Alternately, sites indicative of high grade disease, CIN 2+, which includes CIN 2/3 
categories, carcinoma in situ (CIS), and cancer, may be separated from non-high-grade-disease 
sites. In general, for any embodiment discussed herein in which CIN 2/3 is used as a category 
for classification or characterization of tissue, the more expansive category CIN 2+ may be used 
alternatively. Preferably, the system 100 can differentiate amongst three or more classification 
categories. Exemplary embodiments are described below and include analysis of the time 
response of diffuse reflectance and/or 337-nm fluorescence spectra of a set of reference tissue 
samples with regions having known states of health to determine temporal characteristics 
indicative of the respective states of health. These characteristics are then used in building a 
model to determine a state of health of an unknown tissue sample. Other illustrative
embodiments include analysis of fluorescence spectra using other excitation wavelengths, such 
as 380 nm and 460 nm, for example. 

[0424] According to one illustrative embodiment, an optimum window is determined by
tracking the difference between spectral data of two tissue types using a discrimination function. 
[0425] According to the illustrative embodiment, the discrimination function shown below in 
Equation 76 may be used to extract differences between tissue types:

D(\lambda) = \frac{\mu(\mathrm{test}(\lambda))_{\mathrm{non\text{-}CIN\,2/3}} - \mu(\mathrm{test}(\lambda))_{\mathrm{CIN\,2/3}}}{\sqrt{\sigma^{2}(\mathrm{test}(\lambda))_{\mathrm{non\text{-}CIN\,2/3}} + \sigma^{2}(\mathrm{test}(\lambda))_{\mathrm{CIN\,2/3}}}} \qquad (76)

where μ corresponds to the mean optical signal for the tissue type indicated in the subscript; and
σ corresponds to the standard deviation. The categories CIN 2/3 and non-CIN 2/3 are used in
this embodiment because spectral data is particularly well-suited for differentiating between 
these two categories of tissue, and because spectral data is prominently used in one embodiment 
of the classification schema in the tissue classification module in step 132 of Figure 1 to identify 
CIN 2/3 tissue. Thus, in this way, it is possible to tailor the choice of an optimal scan window 
such that spectral data obtained within that window are well-adapted for use in identifying CIN 
2/3 tissue in the tissue classification scheme 132. In one illustrative embodiment, the optical 
signal in Equation 76 includes diffuse reflectance. In another illustrative embodiment, the 
optical signal includes 337-nm fluorescence emission spectra. Other illustrative embodiments
use fluorescence emission spectra at another excitation wavelength such as 380 nm and 460 nm.
In still other illustrative embodiments, the optical signal is a video signal, Raman signal, or 
infrared signal. Some illustrative embodiments include using difference spectra calculated 
between different phases of acetowhitening, using various normalization schema, and/or using 
various combinations of spectral data and/or image data as discussed above. 
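
As a hedged illustration, the discrimination function of Equation 76 can be evaluated wavelength by wavelength as sketched below. The sketch assumes each tissue category is supplied as a list of spectra sampled at the same wavelengths, and it uses the sample variance; the text does not state whether the sample or the population variance is intended. The function name and the numeric values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def discrimination_spectrum(non_cin_spectra, cin_spectra):
    """Evaluate the discrimination function at each wavelength.

    Each argument is a list of spectra (at least two per category); each
    spectrum is a list of intensities sampled at the same wavelengths.
    """
    values = []
    for non_vals, cin_vals in zip(zip(*non_cin_spectra), zip(*cin_spectra)):
        numerator = mean(non_vals) - mean(cin_vals)
        # Raises ZeroDivisionError if both categories have zero variance.
        denominator = sqrt(variance(non_vals) + variance(cin_vals))
        values.append(numerator / denominator)
    return values


# Two hypothetical spectra per category, two wavelengths each.
non_cin = [[1.0, 4.0], [3.0, 6.0]]
cin = [[5.0, 4.0], [7.0, 8.0]]
print(discrimination_spectrum(non_cin, cin))
```

Large absolute values indicate wavelengths at which the two categories are well separated relative to their spread.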
[0426] In one preferred embodiment, determining an optimal window for obtaining spectral 
data in step 104 of Figure 1 includes developing linear discriminant analysis models using 
spectra from each time bin shown in Table 1 below. 

Table 1: Time bins for which mean spectra are obtained in an exemplary embodiment

Bin    Time after application of acetic acid (s)
1      t < 0
2      0 < t < 40
3      40 < t < 60
4      60 < t < 80
5      80 < t < 100
6      100 < t < 120
7      120 < t < 140
8      140 < t < 160
9      160 < t < 180
10     t > 180



[0427] Alternatively, nonlinear discriminant analysis models may be developed. Generally, 
models for the determination of an optimal window are trained using reflectance and 
fluorescence data separately, although some embodiments include using both data types to train 
a model. The discriminant analysis models discussed herein for exemplary embodiments of the 
determination of an optimal window are generally less sophisticated than the schema used in the 
tissue classification module 132 in Figure 1. Alternatively, a model based on the tissue
classification schema in the module 132 in Figure 1 can be used to determine an optimal window 
for obtaining spectral data in step 104 of Figure 1. 

[0428] In exemplary embodiments for determining an optimal window discussed herein,
reflectance and fluorescence intensities are down-sampled to one value every 10 nm between 
360 and 720 nm. A model is trained by adding and removing intensities in a forward manner, 
continuously repeating the process until the model converges such that additional intensities do 
not appreciably improve tissue classification. Testing is performed by a leave-one-spectrum-out 
jack-knife process. 
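
The training and testing procedure can be sketched as follows. This is an illustration under stated assumptions, not the described implementation: a nearest-class-mean classifier stands in for the discriminant analysis model, only forward additions are made (no removal step), and all function names, the stopping tolerance, and the toy data are hypothetical.

```python
WAVELENGTHS_NM = list(range(360, 721, 10))  # one value every 10 nm, 360-720 nm


def class_means(samples, labels, features, skip):
    """Mean of each selected feature per class, leaving out sample index `skip`."""
    sums, counts = {}, {}
    for i, (spectrum, label) in enumerate(zip(samples, labels)):
        if i == skip:
            continue
        totals = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            totals[k] += spectrum[f]
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def jackknife_accuracy(samples, labels, features):
    """Leave-one-spectrum-out accuracy of a nearest-class-mean classifier."""
    correct = 0
    for i, spectrum in enumerate(samples):
        means = class_means(samples, labels, features, skip=i)
        predicted = min(
            means,
            key=lambda label: sum(
                (spectrum[f] - m) ** 2 for f, m in zip(features, means[label])
            ),
        )
        correct += predicted == labels[i]
    return correct / len(samples)


def forward_select(samples, labels, min_gain=1e-9):
    """Add one feature at a time until accuracy stops improving appreciably."""
    selected, best = [], 0.0
    remaining = list(range(len(samples[0])))
    while remaining:
        choice, choice_score = None, best
        for f in remaining:
            score = jackknife_accuracy(samples, labels, selected + [f])
            if score > choice_score + min_gain:
                choice, choice_score = f, score
        if choice is None:
            break
        selected.append(choice)
        remaining.remove(choice)
        best = choice_score
    return selected, best


# Toy data: three intensities per spectrum; only index 1 separates the classes.
toy_samples = [[5, 1.0, 0], [1, 1.2, 0], [3, 0.9, 0],
               [2, 3.0, 0], [4, 3.1, 0], [6, 2.9, 0]]
toy_labels = ["non", "non", "non", "cin", "cin", "cin"]
print(forward_select(toy_samples, toy_labels))  # ([1], 1.0)
```

With real spectra, each feature index would correspond to one entry of WAVELENGTHS_NM, so the selected indices map directly to model input wavelengths.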
[0429] Figure 35 shows the difference between the mean reflectance spectra for non-CIN 2/3
tissues and CIN 2/3 tissues at three times (prior to the application of acetic acid (graph 976), 
maximum whitening (graph 978, about 60 - 80 seconds post-AA), and the last time data were 
obtained (graph 980, about 160 - 180 seconds post-AA)). The time corresponding to maximum
whitening was determined from reflectance data, and occurs between about 60 seconds and 80
seconds following application of acetic acid. In the absence of acetic acid, the reflectance 
spectra for CIN 2/3 (curve 982 of graph 976 in Figure 35) are on average lower than non-CIN 
2/3 tissue (curve 984 of graph 976 in Figure 35). Following the application of acetic acid, a 
reversal is noted - CIN 2/3 tissues have higher reflectance than the non-CIN 2/3 tissues. The
reflectance of CIN 2/3 and non-CIN 2/3 tissues increases with acetic acid, with CIN 2/3 showing
a larger relative percent change (compare curves 986 and 988 of graph 978 in Figure 35). From 
about 160 s to about 180 s following acetic acid, the reflectance of CIN 2/3 tissue begins to 
return to the pre-acetic acid state, while the reflectance of the non-CIN 2/3 group continues to 
increase (compare curves 990 and 992 of graph 980 in Figure 35). 

[0430] Discrimination function 'spectra' are calculated from the reflectance spectra of CIN 2/3
and non-CIN 2/3 tissues shown in Figure 35 as one way to determine an optimal window for 
obtaining spectral data. Discrimination function spectra comprise values of the discrimination 
function in Equation 76 determined as a function of wavelength for sets of spectral data obtained 
at various times. As shown in Figure 36, the largest differences (measured by the largest
absolute values of discrimination function) are found about 60 s to about 80 s post-acetic acid
(curve 1002), and these data agree with the differences seen in the mean reflectance spectra of 
Figure 35 (curves 986 and 988 of graph 978 in Figure 35). 

[0431] Multivariate linear regression analysis takes into account wavelength interdependencies 
in determining an optimal data acquisition window. One way to do this is to classify spectral
data shown in Figure 35 using a model developed from the reflectance data for each of the bins
in Table 1. Then, the accuracy of the models for each bin is computed and compared. 
Reflectance intensities are down-sampled to one value about every 10 nm between about 360 nm and
about 720 nm. The model is trained by adding intensities in a forward-stepped manner. Testing 
is performed with a leave-one-spectrum-out jack-knife process. The results of the linear
regression show which wavelengths best separate CIN 2/3 from non-CIN 2/3, as shown in Table
2.

Table 2: Forward-selected best reflectance wavelengths for classifying CIN 2/3
from non-CIN 2/3 spectra obtained at different times pre- and post-AA.

Time from AA (s)    LDA Model Input Wavelengths (nm)                 Accuracy (%)
-30                 370, 400, 420, 440, 530, 570, 590, 610           66
30                  420, 430, 450, 600                               74
50                  360, 400, 420, 430, 580, 600                     74
90                  360, 420, 430, 540, 590                          73
110                 360, 440, 530, 540, 590                          71
130                 360, 420, 430, 540, 590                          71
150                 370, 400, 430, 440, 540, 620, 660, 690, 720      72
170                 490, 530, 570, 630, 650                          75



[0432] As shown in Table 2, the two best models for separating CIN 2/3 and non-CIN 2/3,
taking into account wavelength interdependence, use reflectance data obtained at peak CIN 2/3
whitening (from about 60 s to about 80 s) and reflectance data obtained from about 160 s to about
180 s post acetic acid. The first model uses input wavelengths between about 360 and about 600
nm, while the second model uses more red-shifted wavelengths between about 490 and about
650 nm. This analysis shows that the optimal windows are about 60-80 s post AA and about
160-180 s post AA (the latest time bin). This is consistent with the behavior of the discrimination
function spectra shown in Figure 36.

[0433] Figure 37 demonstrates one step in determining an optimal window for obtaining 
spectral data, for purposes of discriminating between CIN 2/3 and non-CIN 2/3 tissue. Figure 37
shows a graph 1006 depicting the performance of the two LDA models described in Table 2
above as applied to reflectance spectral data obtained at various times following application of 
acetic acid 1008. Curve 1010 in Figure 37 is a plot of the diagnostic accuracy of the LDA model 
based on reflectance spectral data obtained between about 60 and about 80 seconds ("peak 
whitening model") as applied to reflectance spectra from the bins of Table 1, and curve 1012 in
Figure 37 is a plot of the diagnostic accuracy of the LDA model based on reflectance spectral
data obtained between about 160 and about 180 seconds, as applied to reflectance spectra from 
the bins of Table 1. For the peak-whitening model, the highest accuracy was obtained at about 
70 s, while accuracies greater than 70% were obtained with spectra collected in a window 
between about 30 s and about 130 s. The 160-180 s model had a narrower window around 70 s,
but performs better at longer times.

[0434] Figure 38 shows the difference between the mean 337-nm fluorescence spectra for
non-CIN 2/3 tissues and CIN 2/3 tissues at three times (prior to application of acetic acid (graph
1014), maximum whitening (graph 1016, about 60 to about 80 seconds post-AA), and at a time
corresponding to the latest time period in which data were obtained (graph 1018, about 160 to
about 180 seconds post-AA)). The time corresponding to maximum whitening was determined
from reflectance data, and occurs between about 60 seconds and 80 seconds following
application of acetic acid. In the absence of acetic acid, the fluorescence spectra for CIN 2/3
tissue (curve 1020 of graph 1014 in Figure 38) and for non-CIN 2/3 tissue (curve 1022 of graph
1014 in Figure 38) are essentially equivalent, with a slightly lower fluorescence noted around 390
nm for CIN 2/3 sites. Following the application of acetic acid, the fluorescence of CIN 2/3 and
non-CIN 2/3 tissues decreases, with CIN 2/3 showing a larger relative percent change (compare
curves 1024 and 1026 of graph 1016 in Figure 38). From about 160 s to about 180 s following
acetic acid application, the fluorescence of CIN 2/3 tissue shows signs of returning to the
pre-acetic acid state while the fluorescence of the non-CIN 2/3 group continues to decrease (compare
curves 1028 and 1030 of graph 1018 in Figure 38).

[0435] An optimal data acquisition window may also be obtained using a discrimination 
function calculated from fluorescence spectra of CIN 2/3 and non-CIN 2/3 tissues shown in 
Figure 38. In one example, discrimination function spectra include values of the discrimination 
function in Equation 76 determined as a function of wavelength for sets of spectral data obtained
at various times. Figure 39 shows a graph 1032 depicting the discrimination function spectra
evaluated using the fluorescence data of Figure 38 obtained prior to application of acetic acid,
and at two times post-AA. As shown in Figure 39, application of acetic acid improves the
distinction between CIN 2/3 and non-CIN 2/3 tissues using fluorescence data. The largest
absolute values are found using data measured within the range of about 160-180 s post-acetic
acid (curve 1042), and these agree with the differences seen in the mean fluorescence spectra of
Figure 38 (curves 1030 and 1028 of graph 1018 in Figure 38). 

[0436] Multivariate linear regression takes into account wavelength interdependencies in 
determining an optimal data acquisition window. An application of one method of determining 
an optimal window includes classifying data represented in the CIN 2/3, CIN 1, and NED
categories in the Appendix Table into CIN 2/3 and non-CIN 2/3 categories by using
classification models developed from the fluorescence data shown in Figure 38. Fluorescence
intensities are down-sampled to one value about every 10 nm between about 360 and about 720 nm.
The model is trained by adding intensities in a forward manner. Testing is performed by a
leave-one-spectrum-out jack-knife process. The result of this analysis shows which wavelengths best
separate CIN 2/3 from non-CIN 2/3, as shown in Table 3.

Table 3: Forward-selected best 337-nm fluorescence wavelengths for
classifying CIN 2/3 from non-CIN 2/3 spectra obtained at different times pre- and post-AA.

Time from AA (s)    LDA Model Input Wavelengths (nm)        Accuracy (%)
-30                 380, 430, 440, 610, 660, 700, 710       61
30                  370, 380, 390, 640                      61
50                  410                                     54
90                  370, 380, 420, 460, 500, 560, 660       64
110                 360, 390, 400, 710                      51
130                 370                                     53
150                 370, 380, 440, 620, 640, 700            65
170                 370, 480, 510, 570, 600, 700, 720       76

[0437] As shown in Table 3, the two best models for separating CIN 2/3 and non-CIN 2/3,
taking into account wavelength interdependences, use data obtained at peak CIN 2/3 whitening 
(60-80 s) and data obtained at the latest time measured (from about 160s to about 180 s post 
acetic acid). The first model uses input wavelengths between about 360 and about 670 nm, 
while the second model uses wavelengths between about 370 and about 720 nm. 

[0438] Figure 40 demonstrates one step in determining an optimal window. Figure 40 shows a
graph 1044 depicting the performance of the two LDA models described in Table 3 above as 
applied to fluorescence spectral data obtained at various times following application of acetic 
acid 1046. Curve 1048 in Figure 40 is a plot of the diagnostic accuracy of the LDA model based 
on fluorescence spectral data obtained between about 60 and about 80 seconds ("peak whitening
model") as applied to fluorescence spectra from the bins of Table 1, and curve 1050 in Figure 40
is a plot of the diagnostic accuracy of the LDA model based on fluorescence spectral data 
obtained between about 160 and about 180 seconds, as applied to fluorescence spectra from the 
bins of Table 1. The accuracies of these models vary depending on when the fluorescence
spectra are recorded relative to the application of acetic acid, as shown in Figure 40. The
predictive ability of the fluorescence models in Figure 40 tends to be less than that of the
reflectance models in Figure 37. Accuracies greater than 70% are obtained with spectra 
collected after about 160 seconds post-AA. 

[0439] One embodiment includes classifying spectral data shown in Figure 38 from known 
reference tissue samples into CIN 2/3 and non-CIN 2/3 categories by using classification models
developed from the fluorescence data for each of the bins in Table 1. Models are developed
based on time post acetic acid. Ratios of fluorescence to reflectance are down-sampled to one 
every 10 nm between about 360 and about 720 nm. The model is trained by adding intensities in 
a forward manner. Testing is performed by a leave-one-spectrum-out jack-knife process. For
this analysis, the model is based on intensities at about 360, 400, 420, 430, 560, 610, and 630
nm. In general, the results are slightly better than a model based on fluorescence alone. 
Improved performance is noted from spectra acquired at about 160 s post acetic acid. 
[0440] Figure 41 shows a graph 1052 depicting the accuracy of three LDA models as applied 
to spectral data obtained at various times following application of acetic acid 1054, used in
determining an optimal window for obtaining spectral data. Curve 1056 in Figure 41 is a plot of
the diagnostic accuracy of the LDA model based on reflectance spectral data obtained between 
about 60 and about 80 seconds ("peak whitening model"), also shown as curve 1010 in Figure 
37. Curve 1058 in Figure 41 is a plot of the diagnostic accuracy of the LDA model based on 
fluorescence spectral data obtained between about 60 and about 80 seconds ("peak whitening
model"), also shown as curve 1048 in Figure 40. Curve 1060 in Figure 41 is a plot of the
diagnostic accuracy of the LDA model based on fluorescence intensity divided by reflectance, as
described in the immediately preceding paragraph. 

[0441] The exemplary embodiments discussed above and illustrated in Figures 35 to 41 
provide a basis for selecting an optimum window for obtaining spectral data upon application of
acetic acid. Other factors to be considered include the time required to apply the contrast agent
and to perform target focusing as shown in Figure 27A. Another factor is the time required to 
perform a scan over a sufficient number of regions of a tissue sample to provide an adequate 
indication of disease state with sufficient sensitivity and selectivity. Also, a consideration may 
be made for the likelihood of the need for and time required for retakes due to patient motion. 

[0442] The factors and analysis discussed above indicate that an optimal data acquisition
window is a period of time from about 30 seconds following application of a contrast agent (for
example, a 5 volume percent acetic acid solution) to about 130 seconds following application of 
the contrast agent. Other optimal windows are possible. For example, one alternative 
embodiment uses an optimal window with a "start" time from about 10 to about 60 seconds
following application of acetic acid, and an "end" time from about 110 to about 180 seconds
following application of acetic acid. 

[0443] An alternative manner for determining an optimal window comprises determining and
using a relative amplitude change and/or rate of amplitude change as a trigger for obtaining
spectral data from a sample. By using statistical and/or heuristic methods such as those discussed
herein, it is possible to relate more easily-monitored relative changes or rates-of-change of one or 
more optical signals from a tissue sample to corresponding full spectrum signals that can be used 
in characterizing the state of health of a given sample. For example, by performing a
discrimination function analysis, it may be found for a given tissue type that when the relative
change in reflectance at a particular wavelength exceeds a threshold value, the corresponding 
full-spectrum reflectance can be obtained and then used to accurately classify the state of health 
of the tissue. In addition, the triggers determined above may be converted into optimal time 
windows for obtaining diagnostic optical data from a sample. 

[0444] Figure 42 shows how an optical amplitude trigger is used to determine an optimal time
window for obtaining diagnostic optical data. The graph 1062 in Figure 42 plots the normalized
relative change of mean reflectance signal 1064 from tissue samples with a given state of health
as a function of time following application of acetic acid 1066. The mean reflectance signals
determined from CIN 1, CIN 2, and metaplasia samples are depicted in Figure 42 by curves
1068, 1070, and 1072, respectively. Figure 42 shows that when the normalized relative change
of mean reflectance reaches or exceeds 0.75 in this example, the image intensity data and/or the
full reflectance and/or fluorescence spectrum is most indicative of a given state of health of a
sample. Thus, for CIN 2 samples, for example, this corresponds to a time period between t1 and
t2, as shown in the graph 1062 of Figure 42. Therefore, spectral and/or image data obtained from
a tissue sample between t1 and t2 following application of acetic acid are used in accurately
determining whether or not CIN 2 is indicated for that sample. In one embodiment, the relative
change of reflectance of a tissue sample at one or more given wavelengths is monitored. When
that relative change is greater than or equal to the 0.75 threshold, for example, more
comprehensive spectral and/or image data are obtained to characterize whether the sample is
indicative of CIN 2. In another embodiment, a predetermined range of values of the relative
optical signal change is used such that when the relative signal change falls within the 
predetermined range of values, additional spectral and/or image data is captured in order to 
characterize the sample. 
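
Converting an amplitude trigger into a time window can be sketched as below. The sketch assumes that "normalized relative change" means the change from the first sample divided by the peak change, so that the curve peaks at 1, and that the qualifying samples form one contiguous interval; the function name and the sample curve are hypothetical.

```python
def amplitude_window(times, signal, threshold=0.75):
    """Return (t1, t2), the first and last sample times at which the
    normalized relative change is at or above the threshold, or None."""
    initial = signal[0]
    changes = [value - initial for value in signal]
    peak = max(changes)
    if peak <= 0:
        return None  # the signal never rises above its initial value
    inside = [t for t, c in zip(times, changes) if c / peak >= threshold]
    return inside[0], inside[-1]


# Hypothetical mean reflectance curve (arbitrary units), sampled every 20 s.
times = [0, 20, 40, 60, 80, 100, 120]
signal = [1.0, 1.2, 1.6, 2.0, 1.9, 1.7, 1.3]
print(amplitude_window(times, signal))  # (60, 80)
```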

[0445] Figure 43 shows how a rate-of-change of an optical amplitude trigger is used to
determine an optimal time window for obtaining diagnostic optical data. The graph 1074 of
Figure 43 plots the slope of an exemplary mean reflectance signal 1076 from tissue samples with
a given state of health as a function of time following application of acetic acid 1078. The slope
of mean reflectance is a measure of the rate of change of the mean reflectance signal. The rates of
change of mean reflectance determined from CIN 1, CIN 2, and metaplasia samples are depicted
in Figure 43 by curves 1080, 1082, and 1084, respectively. Those curves show that when the
absolute value of the slope is less than or equal to 0.1, for example, in the vicinity of maximum
reflectance, the image intensity data and/or the full reflectance and/or fluorescence spectrum is
most indicative of a given state of health of a sample. Thus, for CIN 2 samples, for example, this
corresponds to a time period between t1 and t2 as shown in the graph 1074 of Figure 43.
Therefore, spectral and/or image data obtained from a tissue sample between t1 and t2 following
application of acetic acid are used in accurately determining whether or not CIN 2 is indicated for
that sample. In the example, the rate of change of reflectance of a tissue sample is monitored at
one or more wavelengths. When that rate of change has an absolute value less than or equal to
0.1, more comprehensive spectral and/or image data are obtained from the sample for purposes
of characterizing whether or not the sample is indicative of CIN 2. Figure 43 demonstrates use 
of a range of values of rate of optical signal change. Other embodiments use a single threshold 
value. 
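
A corresponding sketch for the rate-of-change trigger is given below. It assumes finite-difference slopes between consecutive samples and grows the window outward from the sample of maximum reflectance while the absolute slope stays at or below the limit, which is one way to restrict the window to the vicinity of the maximum. Names, units, and values are hypothetical.

```python
def flat_window_around_peak(times, signal, max_abs_slope=0.1):
    """Return (t1, t2) bounding the samples around the signal maximum
    for which the absolute finite-difference slope is within the limit."""
    slopes = [
        (signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
        for i in range(len(signal) - 1)
    ]
    peak = max(range(len(signal)), key=signal.__getitem__)
    lo = peak
    while lo > 0 and abs(slopes[lo - 1]) <= max_abs_slope:
        lo -= 1  # extend the window to earlier samples
    hi = peak
    while hi < len(signal) - 1 and abs(slopes[hi]) <= max_abs_slope:
        hi += 1  # extend the window to later samples
    return times[lo], times[hi]


# Hypothetical curve with unit time steps; slopes near the peak are small.
print(flat_window_around_peak([0, 1, 2, 3, 4, 5, 6],
                              [0.0, 0.5, 0.92, 1.0, 0.95, 0.6, 0.2]))  # (2, 4)
```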

Motion tracking 

[0446] In one embodiment, the tissue characterization system shown in Figure 1 comprises 
real-time motion tracking (step 106 in Figure 1). Real-time tracking determines a correction for 
and/or compensates for a misalignment between two images of the tissue sample obtained during 
a spectral data scan (i.e. step 732 in Figure 27A and 27B), where the misalignment is caused by a 
shift in the position of the sample with respect to the instrument 102 in Figure 1 (or, more 
particularly, the probe optics 178). The misalignment may be caused by unavoidable patient 
motion, such as motion due to breathing during the spectral data scan 732. 
[0447] In one embodiment, the correction factor determined by the real-time tracker is used to 
automatically compensate for patient motion, for example, by adjusting the instrument 102 
(Figure 1) so that spectral data obtained from indexed regions of the tissue sample during the 
scan correspond to their originally-indexed locations. Alternatively or additionally, the motion 
correction factor can be used in spectral data pre-processing, step 114 in Figure 1 and Figure 11,
to correct spectral data obtained during a scan according to an applicable correction factor. For
example, the spectral data lookup method in step 114 of Figure 1 as discussed herein may
compensate for patient motion by using a correction determined by the real-time tracker 106 to 
correlate a set of spectral data obtained during a scan with its true, motion-corrected position 
(x,y) on the tissue sample. In one embodiment, the motion correction factor determined in step 
106 of Figure 1 is updated about once every second during the scan using successive images of
the tissue, as shown in Figure 27B. Step 106 determines and validates a motion correction factor
about once every second during the spectral scan, corresponding to each successive image in
Figure 27B. Then, the pre-processing component 114 of Figure 1 corrects the spectral data
obtained at an interrogation point during the spectral scan using the correction factor
corresponding to the time at which the spectral data were obtained.
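
The time-based lookup of a correction factor can be sketched as follows. The sketch assumes that each correction is a cumulative (dx, dy) displacement relative to the start of the scan, that corrections are stored in increasing time order, and that a correction is added to the nominal position; the sign convention, the function name, and the values are assumptions.

```python
from bisect import bisect_right


def corrected_position(point_xy, acquired_at, correction_times, corrections):
    """Apply the most recent correction determined at or before acquired_at.

    correction_times: increasing times (s) at which corrections were determined.
    corrections: (dx, dy) displacement for each entry in correction_times.
    """
    i = bisect_right(correction_times, acquired_at) - 1
    if i < 0:
        return point_xy  # no correction has been determined yet
    dx, dy = corrections[i]
    return (point_xy[0] + dx, point_xy[1] + dy)


# Corrections determined about once per second (hypothetical values, in mm).
times_s = [0.0, 1.0, 2.0]
shifts_mm = [(0.0, 0.0), (0.25, -0.5), (0.5, -0.5)]
print(corrected_position((10.0, 5.0), 1.4, times_s, shifts_mm))  # (10.25, 4.5)
```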

[0448] A typical misalignment between two images obtained about 1 second apart is less than 
about 0.55-mm within a two-dimensional, 480 x 500 pixel image frame field covering a tissue 
area of approximately 25-mm x 25-mm. These dimensions provide an example of the relative 
scale of misalignment versus image size. In some instances it is only necessary to compensate
for misalignments of less than about one millimeter within the exemplary image frame field
defined above. In other cases, it is necessary to compensate for misalignments of less than about
0.3-mm within the exemplary image frame field above. Also, the dimensions represented by the 
image frame field, the number of pixels of the image frame field, and/or the pixel resolution may 
differ from the values shown above. 
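
As a worked example of the scale involved, the conversion from millimeters to pixels for the exemplary frame can be written as below. The text does not say which image axis has 480 pixels, so the sketch reports both axes; the function and constant names are hypothetical.

```python
FRAME_PIXELS = (480, 500)   # pixels along the two image axes
FRAME_MM = (25.0, 25.0)     # tissue area covered along the same axes, in mm


def mm_to_pixels(shift_mm):
    """Convert a displacement in mm to pixels along each image axis."""
    return tuple(shift_mm * px / mm for px, mm in zip(FRAME_PIXELS, FRAME_MM))


# A 0.55-mm misalignment corresponds to roughly 10.6 and 11 pixels.
print(mm_to_pixels(0.55))
```

At about 0.05 mm per pixel, the 0.3-mm case mentioned above corresponds to a shift of roughly six pixels.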

[0449] A misalignment correction determination may be inaccurate, for example, due to any
one or a combination of the following: non-translational sample motion such as rotational 
motion, local deformation, and/or warping; changing features of a sample such as whitening of 
tissue; and image recording problems such as focus adjustment, missing images, blurred or 
distorted images, low signal-to-noise ratio, and computational artifacts. Validation procedures of
the invention identify such inaccuracies. The methods of validation may be conducted
"on-the-fly" in concert with the methods of determining misalignment corrections in order to improve
accuracy and to reduce the time required to conduct a given test. 

[0450] In order to facilitate the automatic analysis in the tissue classification system 100 of 
Figure 1, it is often necessary to adjust for misalignments caused by tissue sample movement
that occurs during the diagnostic procedure. For example, during a given procedure, in vivo
tissue may spatially shift within the image frame field from one image to the next due to 
movement of the patient. Accurate tissue characterization requires that this movement be taken 
into account in the automated analysis of the tissue sample. In one embodiment, spatial shift 
correction made throughout a spectral data scan is more accurate than a correction made after the
scan is complete, since "on-the-fly" corrections compensate for smaller shifts occurring over
shorter periods of time and since spectral data is being continuously obtained throughout the
approximately 12 to 15 second scan in the embodiment of Figure 27B.

[0451] If a sample moves while a sequence of images is obtained, the procedure may have to
be repeated. For example, this may be because the shift between consecutive images is too large 
to be accurately compensated for, or because a region of interest moves outside of a usable 
portion of the frame captured by the optical signal detection device. Stepwise motion correction 
of spectral data reduces the cumulative effect of sample movement. If correction is made only
after an entire sequence is obtained, it may not be possible to accurately compensate for some 
types of sample movement. On-the-fly, stepwise compensation for misalignment reduces the
need for retakes. 

[0452] On-the-fly compensation may also obviate the need to obtain an entire sequence of 
images before making the decision to abort a failed procedure, particularly when coupled with
on-the-fly, stepwise validation of the misalignment correction determination. For example, if the 
validation procedure detects that a misalignment correction determination is either too large for 
adequate compensation to be made or is invalid, the procedure may be aborted before obtaining 
the entire sequence of images. It can be immediately determined whether or not the obtained 
data is usable. Retakes may be performed during the same patient visit; no follow-up visit to
repeat an erroneous test is required. A diagnostic test invalidated by excessive movement of the 
patient may be aborted before obtaining the entire sequence of images, and a new scan may be 
completed, as long as there is enough remaining time in the optimal time window for obtaining 
spectral data. 

[0453] In preferred embodiments, a determination of misalignment correction is expressed as a
translational displacement in two dimensions, x and y. Here, x and y represent Cartesian 
coordinates indicating displacement on the image frame field plane. In other embodiments, 
corrections for misalignment are expressed in terms of non-Cartesian coordinate systems, such as 
biradical, spherical, and cylindrical coordinate systems, among others. Alternatives to Cartesian
coordinate systems may be useful, for example, where the image frame field is non-planar.

[0454] Some types of sample motion - including rotational motion, warping, and local 
deformation -- may result in an invalid misalignment correction determination, since it may be 
impossible to express certain instances of these types of sample motion in terms of a translational 
displacement, for example, in the two Cartesian coordinates x and y. It is noted, however, that in
some embodiments, rotational motion, warping, local deformation, and/or other kinds of
non-translational motion are acceptably accounted for by a correction expressed in terms of a
translational displacement. The changing features of the tissue, as in acetowhitening, may also
affect the determination of a misalignment correction. Image recording problems such as focus
adjustment, missing images, blurred or distorted images, low signal-to-noise ratio (i.e. caused by
glare), and computational artifacts may affect the correction determination as well. Therefore,
validation of a determined correction is often required. In some embodiments, a validation step
includes determining whether an individual correction for misalignment is erroneous, as well as
determining whether to abort or continue the test in progress. Generally, validation comprises
splitting at least a portion of each of a pair of images into smaller, corresponding units 
(subimages), determining for each of these smaller units a measure of the displacement that 
occurs within the unit between the two images, and comparing the unit displacements to the 
overall displacement between the two images. 

[0455] In certain embodiments, the method of validation takes into account the fact that
features of a tissue sample may change during the capture of a sequence of images. For
example, the optical intensity of certain regions of tissue changes during the approximately 12 to
15 seconds of a scan, due to acetowhitening of the tissue. Therefore, in one embodiment,
validation of a misalignment correction determination is performed using a pair of consecutive
images. In this way, the difference between the corresponding validation cells of the two
consecutive images is less affected by gradual tissue whitening changes, as compared with
images obtained further apart in time. In an alternative embodiment, validation is performed
using pairs of nonconsecutive images taken within a relatively short period of time, compared
with the time in which the overall sequence of images is obtained. In other embodiments,
validation comprises the use of any two images in the sequence of images.

[0456] A determination of misalignment correction between two images is inadequate if
significant portions of the images are featureless or have low signal-to-noise ratio (i.e. are
affected by glare). Similarly, validation using cells containing significant portions that are
featureless or that have low signal-to-noise ratio may result in the erroneous invalidation of valid
misalignment correction determinations. This may occur in cases where the featureless portion
of the overall image is small enough so that it does not adversely affect the misalignment
correction determination. For example, analysis of featureless validation cells may produce
meaningless correlation coefficients. One embodiment includes identifying one or more
featureless cells and eliminating them from consideration in the validation of a misalignment
correction determination, thereby preventing rejection of a good misalignment correction.

[0457] A determination of misalignment correction may be erroneous due to a computational
artifact of data filtering at the image borders. For example, in one exemplary embodiment, an
image with large intensity differences between the upper and lower borders and/or the left and
right borders of the image frame field undergoes Laplacian of Gaussian frequency domain
filtering. Since Laplacian of Gaussian frequency domain filtering corresponds to cyclic
convolution in the space-time domain, these intensity differences (discontinuities) yield a large
gradient value at the image border, and cause the overall misalignment correction determination
to be erroneous, since changes between the two images due to spatial shift are dwarfed by the
edge effects. One alternative embodiment employs pre-multiplication of image data by a 
Hamming window to remove or reduce this "wraparound error." However, one preferred 
embodiment employs an image-blending technique such as feathering, to smooth any border 
discontinuity, while requiring only a minimal amount of additional processing time. 
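
By way of illustration only, the Hamming-window alternative mentioned above may be sketched as follows. The function name `hamming_2d` and the synthetic test image are assumptions introduced for this sketch; the separable (outer-product) window is one common way to build a two-dimensional window and is not stated in the text.

```python
import numpy as np

def hamming_2d(rows, cols):
    """Separable 2-D Hamming window (outer product of two 1-D windows)."""
    return np.outer(np.hamming(rows), np.hamming(cols))

# Synthetic 256 x 256 image (hypothetical data): dark at the top border,
# bright at the bottom border, so cyclic filtering would see a large
# discontinuity where the bottom edge wraps around to the top edge.
image = np.tile(np.linspace(0.0, 1.0, 256)[:, None], (1, 256))

# Pre-multiplying by the window tapers all four borders toward small values.
windowed = image * hamming_2d(256, 256)

# Size of the top-to-bottom wraparound jump in the middle column.
jump_before = abs(image[0, 128] - image[-1, 128])
jump_after = abs(windowed[0, 128] - windowed[-1, 128])
```

The window suppresses the border discontinuity at the cost of attenuating real image content near the edges, which may be one reason a blending technique such as feathering is preferred in the text.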

[0458] Figure 44A represents a 480 x 500 pixel image 1086 from a sequence of images of in
vivo human cervix tissue and shows a 256 x 256 pixel portion 1088 of the image that the motion
correction step 106 in Figure 1 uses in identifying a misalignment correction between two
images from a sequence of images of the tissue, according to one embodiment. The image 1086
of Figure 44A has a pixel resolution of about 0.054-mm. The embodiments described herein
show images with pixel resolutions of about 0.0547-mm to about 0.0537-mm. Other
embodiments have pixel resolutions outside this range. In some embodiments, the images of a
sequence have an average pixel resolution of between about 0.044-mm and about 0.064-mm. In
the embodiment of Figure 44A, step 106 of the system of Figure 1 uses the central 256 x 256
pixels 1088 of the image 1086 for motion tracking. An alternative embodiment uses a region of
different size for motion tracking, which may or may not be located in the center of the image
frame field. In the embodiment of Figure 44A, the motion tracking step 106 of Figure 1
determines an x-displacement and a y-displacement corresponding to the translational shift
(misalignment) between the 256 x 256 central portions 1088 of two images in the sequence of
images obtained during a patient spectral scan.

[0459] The determination of misalignment correction may be erroneous for any number of
various reasons, including but not limited to non-translational sample motion (i.e. rotational
motion, local deformation, and/or warping), changing features of a sample (i.e. whitening of
tissue), and image recording problems such as focus adjustment, missing images, blurred or
distorted images, low signal-to-noise ratio, and computational artifacts. Therefore, in preferred
embodiments, validation comprises splitting an image into smaller units (called cells),
determining displacements of these cells, and comparing the cell displacements to the overall
displacement. Figure 44B depicts the image represented in Figure 44A and shows a 128 x 128
pixel portion 1090 of the image, made up of 16 individual 32 x 32 pixel validation cells 1092,
from which data is used to validate the misalignment correction.
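
The region geometry described for Figures 44A and 44B may be sketched as follows, for illustration only. The helper names `central_region` and `split_into_cells` are hypothetical, the frame is assumed to be stored as 480 rows by 500 columns, and both regions are assumed to be centered in the frame.

```python
import numpy as np

def central_region(image, size):
    """Return the centered size x size portion of a 2-D array."""
    rows, cols = image.shape
    top = (rows - size) // 2
    left = (cols - size) // 2
    return image[top:top + size, left:left + size]

def split_into_cells(region, cell_size=32):
    """Split a square region into a grid of cell_size x cell_size cells.

    Returns an array of shape (n, n, cell_size, cell_size), where
    cells[r, c] is the cell in grid row r and grid column c.
    """
    n = region.shape[0] // cell_size
    trimmed = region[:n * cell_size, :n * cell_size]
    return trimmed.reshape(n, cell_size, n, cell_size).swapaxes(1, 2)

# Placeholder frame standing in for image 1086 (assumed 480 rows x 500 columns).
frame = np.arange(480 * 500, dtype=float).reshape(480, 500)

tracking = central_region(frame, 256)      # analogous to portion 1088
validation = central_region(frame, 128)    # analogous to portion 1090
cells = split_into_cells(validation, 32)   # analogous to the 16 cells 1092
```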

[0460] Figure 45, Figures 46A and B, and Figures 47A and B depict steps in illustrative
embodiment methods of determining a misalignment correction between two images of a
sequence, and methods of validating that determination. Steps 1096 and 1098 of Figure 45 show
development of data from an initial image with which data from a subsequent image are
compared in order to determine a misalignment correction between the subsequent image and the
initial image. An initial image "o" is preprocessed, then filtered to obtain a matrix of values, for
example, optical luminance (brightness, intensity), representing a portion of the initial image. In
one embodiment, preprocessing comprises transforming the three RGB color components
corresponding to a given pixel into a single luminance value. An exemplary luminance is CCIR
601, shown in Equation 63. CCIR 601 luminance may be used, for example, as a measure of the
"whiteness" of a particular pixel in an image from an acetowhitening test. Different expressions
for grayscale luminance may be used, and the choice may be geared to the specific type of
diagnostic test conducted. The details of step 1096 of Figure 45 are illustrated in blocks 1130,
1132, and 1134 of Figure 46A, where block 1130 represents the initial color image, "o", in the
sequence, block 1132 represents conversion of color data to grayscale using Equation 63, and
block 1134 represents the image of block 240 after conversion to grayscale. Referring now to
Figures 46A and 46B, Figure 46B is a continuation of Figure 46A, linked, for example, by the
circled connectors labeled A and B. Accordingly, going forward, Figures 46A and 46B are
referred to as Figure 46.
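
For illustration, the RGB-to-luminance preprocessing may be sketched as follows. Equation 63 is not reproduced in this passage; the weights below are the standard CCIR 601 luma coefficients and are assumed to correspond to it. The function name `rgb_to_luminance` is hypothetical.

```python
import numpy as np

# Standard CCIR 601 luma weights for R, G, and B
# (assumed to be what Equation 63 specifies).
CCIR601_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_luminance(rgb):
    """Convert an (H, W, 3) RGB array to an (H, W) luminance array."""
    return np.asarray(rgb, dtype=float) @ CCIR601_WEIGHTS

# Two single-pixel example images (hypothetical data).
pixel_white = np.array([[[255, 255, 255]]])
pixel_red = np.array([[[255, 0, 0]]])

lum_white = rgb_to_luminance(pixel_white)
lum_red = rgb_to_luminance(pixel_red)
```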

[0461] Step 1098 of Figure 45 represents filtering a 256 x 256 portion of the initial image, for
example, a portion analogous to the 256 x 256 central portion 1088 of the image 1086 of Figure
44A, using Laplacian of Gaussian filtering. Other filtering techniques are used in other
embodiments. Preferred embodiments employ Laplacian of Gaussian filtering, which combines
the Laplacian second derivative approximation with the Gaussian smoothing filter to reduce the
high frequency noise components prior to differentiation. This filtering step may be performed
by discrete convolution in the space domain, or by frequency domain filtering. The Laplacian of
Gaussian (LoG) filter may be expressed in terms of x and y coordinates (centered on zero) as
shown in Equation 77:




(77)

where x and y are space coordinates and σ is the Gaussian standard deviation. In one preferred
embodiment, an approximation to the LoG function is used. Illustrative embodiments described
herein include use of an approximation kernel(s) of size 9 x 9, 21 x 21, and/or 31 x 31. The
Gaussian standard deviation, σ, is chosen in certain preferred embodiments using Equation 78:

σ = LoG filter size / 8.49 (78)

where LoG filter size corresponds to the size of the discrete kernel approximation to the LoG
function (i.e. 9, 21, and 31 for the approximation kernels used herein). Other embodiments
employ different kernel approximations and/or different values of Gaussian standard deviation.
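
A discrete LoG kernel sized according to Equation 78 may be sketched as follows, for illustration only. Equation 77 is not legible in this passage, so the sketch assumes the common textbook form LoG(x,y) = -(1/(πσ⁴)) [1 - (x²+y²)/(2σ²)] exp(-(x²+y²)/(2σ²)); the function name `log_kernel` and the mean-subtraction step are likewise assumptions of this sketch.

```python
import numpy as np

def log_kernel(size):
    """Sample a Laplacian of Gaussian on a size x size grid centered on zero.

    The standard deviation follows Equation 78: sigma = size / 8.49.
    The closed form used here is the common textbook LoG, assumed to
    match Equation 77.
    """
    sigma = size / 8.49
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    r2 = x**2 + y**2
    kernel = (-1.0 / (np.pi * sigma**4)) \
        * (1.0 - r2 / (2.0 * sigma**2)) \
        * np.exp(-r2 / (2.0 * sigma**2))
    # Subtract the mean so that a constant image gives zero response
    # (a common practical adjustment; not stated in the source).
    return kernel - kernel.mean()

k9 = log_kernel(9)     # LoG 9 approximation kernel
k21 = log_kernel(21)   # LoG 21 approximation kernel
```

With this sign convention the kernel is negative at its center and positive in the surrounding ring; some implementations use the opposite sign.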
[0462] The LoG filter size may be chosen so that invalid scans are failed and valid scans are 
passed with a minimum of error. Generally, use of a larger filter size is better at reducing large
structured noise and is more sensitive to larger image features and larger motion, while use of a 
smaller filter size is more sensitive to smaller features and smaller motion. One embodiment of 
the invention comprises adjusting filter size to coordinate with the kind of motion being tracked 
and the features being imaged. 
[0463] The details of step 1098 of Figure 45 are illustrated in Figure 46 in blocks 1134, 1136,
and 1138, where block 1134 represents data from the initial image in the sequence after
conversion to grayscale luminance, block 1136 represents the application of the LoG filter, and
block 1138 represents the 256 x 256 matrix of data values, Go(x,y), which is the "gold standard"
by which other images are compared in validating misalignment correction determinations in this
embodiment. As detailed in Figures 47A and 47B, one embodiment validates a misalignment
correction determination by comparing a given image to its preceding image in the sequence, not
by comparing a given image to the initial image in the sequence as shown in Figure 46.
(Referring now to Figures 47A and 47B, Figure 47B is a continuation of Figure 47A, linked, for
example, by the circled connectors labeled A, B, and C. Accordingly, going forward, Figures
47A and 47B are referred to as Figure 47.) Although Figure 45, Figure 46, and Figure 47 show
application of the LoG filter as a discrete convolution in the space domain, resulting in a
standard expressed in space coordinates, other embodiments comprise applying the LoG filter in
the frequency domain. In either case, the LoG filter is preferably zero padded to the image size.
[0464] Steps 1100 and 1102 of Figure 45 represent preprocessing an image "i"
by converting RGB values to grayscale luminance as discussed above, and performing LoG
filtering to obtain Gi(x,y), a matrix of values from image "i" which is compared with that of
another image in the sequence in order to determine a misalignment correction between the two
images. The details of steps 1100 and 1102 of Figure 45 are illustrated in Figure 46 in blocks
1140, 1142, 1144, 1146, and 1148, where fi(x,y) in block 1140 is the raw image data from image
"i", block 1142 represents conversion of the fi(x,y) data to gray scale intensities as shown in
block 1144, and block 1146 represents application of the LoG filter on the data of block 1144 to
produce the data of block 1148, Gi(x,y).

[0465] Similarly, steps 1106 and 1108 of Figure 45 represent preprocessing an image "j" by
converting RGB values to grayscale luminance as discussed above, and performing LoG filtering
to obtain Gj(x,y), a matrix of values from image "j" which is compared with image "i" in order to
determine a measure of misalignment between the two images. In some preferred embodiments,
image "j" is subsequent to image "i" in the sequence. In some preferred embodiments, "i" and
"j" are consecutive images. Steps 1106 and 1108 of Figure 45 are illustrated in Figure 46 in
blocks 1154, 1156, 1158, 1160, and 1162, where "j" is "i+1", the image consecutive to image "i"
in the sequence. In Figure 46, block 1154 is the raw "i+1" image data, block 1156 represents
conversion of the "i+1" data to gray scale intensities as shown in block 1158, and block 1160
represents application of the LoG filter on the data of block 1158 to produce the data of block
1162, Gi+1(x,y).

[0466] Steps 1104 and 1110 of Figure 45 represent applying a Fourier transform, for example,
a Fast Fourier Transform (FFT), using Gi(x,y) and Gj(x,y), respectively, to obtain Fi(u,v) and
Fj(u,v), which are matrices of values in the frequency domain corresponding to data from images
"i" and "j", respectively. Details of steps 1104 and 1110 of Figure 45 are illustrated in Figure 46
by blocks 1148, 1150, 1152, 1162, 1164, and 1166, where "j" is "i+1", the image consecutive to
image "i" in the sequence. In Figure 46, block 1148 represents the LoG filtered data, Gi(x,y),
corresponding to image "i", and block 1150 represents taking the Fast Fourier Transform of
Gi(x,y) to obtain Fi(u,v), shown in block 1152. Similarly, in Figure 46 block 1162 is the LoG
filtered data, Gi+1(x,y), corresponding to image "i+1", and block 1164 represents taking the Fast
Fourier Transform of Gi+1(x,y) to obtain Fi+1(u,v), shown in block 1166.

[0467] Step 1112 of Figure 45 represents computing the cross correlation Fi(u,v) F*j(u,v),
where Fi(u,v) is the Fourier transform of data from image "i", F*j(u,v) is the complex conjugate
of the Fourier transform of data from image "j", and u and v are frequency domain variables.
The cross-correlation of two signals of length N1 and N2 provides N1+N2-1 values; thus, to avoid
aliasing problems due to under-sampling, the two signals should be padded with zeros up to
N1+N2-1 samples. Details of step 1112 of Figure 45 are represented in Figure 46 by blocks
1152, 1166, and 1168. Block 1168 of Figure 46 represents computing the cross correlation,
Fi(u,v) F*i+1(u,v), using Fi(u,v), the Fourier transform of data from image "i", and F*i+1(u,v), the
complex conjugate of the Fourier transform of data from image "i+1". The cross-correlation
may also be expressed as c(k,l) in Equation 79:

$$c(k,l) = \sum_{p}\sum_{q} I_1(p,q)\, I_2(p-k,\, q-l) \tag{79}$$

where variables (k,l) can be thought of as the shifts in each of the x- and y-directions which are
being tested in a variety of combinations to determine the best measure of misalignment between
two images I1 and I2, and where p and q are matrix element markers.

[0468] Step 1114 of Figure 45 represents computing the inverse Fourier transform of the
cross-correlation computed in step 1112. Step 1114 of Figure 45 is represented in Figure 46 by block
1170. The resulting inverse Fourier transform maps how well the 256 x 256 portions of images
"i" and "j" match up with each other given various combinations of x- and y-shifts. Generally,
the normalized correlation coefficient closest to 1.0 corresponds to the x-shift and y-shift
position providing the best match, and is determined from the resulting inverse Fourier
transform. In a preferred embodiment, correlation coefficients are normalized by dividing
matrix values by a scalar computed as the product of the square root of the (0,0) value of the
auto-correlation of each image. In this way, variations in overall brightness between the two
images have a more limited effect on the correlation coefficient, so that the actual movement
within the image frame field between the two images is better reflected in the misalignment
determination.
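
The frequency-domain correlation of steps 1104 through 1114 may be sketched as follows, for illustration only. The function name `estimate_shift`, the row/column (dy, dx) return order, and the synthetic test data are assumptions of this sketch; the sign convention follows Equation 79, so the returned shift (k, l) is the one for which I1(p,q) best matches I2(p-k, q-l).

```python
import numpy as np

def estimate_shift(img_i, img_j):
    """Estimate the translational shift between two filtered images.

    Returns (dy, dx, peak), where peak is the normalized correlation
    value at the best-matching shift.
    """
    # Zero-pad to N1 + N2 - 1 samples per axis to avoid wraparound aliasing.
    n_rows = img_i.shape[0] + img_j.shape[0] - 1
    n_cols = img_i.shape[1] + img_j.shape[1] - 1
    F_i = np.fft.fft2(img_i, s=(n_rows, n_cols))
    F_j = np.fft.fft2(img_j, s=(n_rows, n_cols))

    # Cross-correlation: inverse transform of F_i times the conjugate of F_j.
    corr = np.real(np.fft.ifft2(F_i * np.conj(F_j)))

    # Normalize by the product of the square roots of the zero-lag
    # auto-correlation values of the two images.
    corr /= np.sqrt(np.sum(img_i**2)) * np.sqrt(np.sum(img_j**2))

    k, l = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices at or beyond the image size represent negative shifts.
    dy = k if k < img_i.shape[0] else k - n_rows
    dx = l if l < img_i.shape[1] else l - n_cols
    return int(dy), int(dx), float(corr[k, l])

# Synthetic example: two 32 x 32 windows cut from one random field,
# offset by 3 rows and -2 columns (hypothetical data).
field = np.random.default_rng(0).standard_normal((64, 64))
img_i = field[10:42, 10:42]
img_j = field[13:45, 8:40]
dy, dx, peak = estimate_shift(img_i, img_j)
```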

[0469] Step 1116 of Figure 45 represents determining misalignment values dx, dy, d, sum(dx),
sum(dy), and Sum(dj), where dx is the computed displacement between the two images "i" and
"j" in the x-direction, dy is the computed displacement between the two images in the
y-direction, d is the square root of the sum dx² + dy² and represents an overall displacement between
the two images, sum(dx) is the cumulative x-displacement between the current image "j" and the
first image in the sequence "o", sum(dy) is the cumulative y-displacement between the current
image "j" and the first image in the sequence "o", and Sum(dj) is the cumulative displacement, d,
between the current image "j" and the first image in the sequence "o". Step 1116 of Figure 45 is
represented in Figure 46 by blocks 1172, 1174, and 1176. Blocks 1174 and 1176 represent
finding the maximum value in the data of block 1172 in order to calculate dx, dy, d, sum(dx),
sum(dy), and Sum(di+1) as described above, where image "j" in Figure 45 is "i+1" in Figure 46,
the image consecutive to image "i". For example, in the scan illustrated by block 732 in Figure
27B, if image "i" is the image at block 750, then image "j" is the next consecutive image (the
image at block 752).

[0470] Steps 1118, 1120, and 1122 of Figure 45 represent one method of validating the
misalignment correction determined for image "j" in step 1116 of Figure 45. This method of
validating misalignment correction is represented in blocks 1177, 1179, 1181, 1190, 1192, and
1194 of Figure 47. Another method of validating a misalignment correction is represented in
steps 1124, 1126, and 1128 of Figure 45; and this method is represented in blocks 1178, 1180,
1182, 1184, 1186, and 1188 of Figure 46. Figure 47 is a schematic flow diagram depicting steps
in a version of the methods shown in Figure 45 of determining a correction for a misalignment
between two images in which validation is performed using data from two consecutive images.
One embodiment includes using consecutive or near-consecutive images to validate a
misalignment correction determination, as in Figure 47. Other embodiments comprise using the
initial image to validate a misalignment correction determination for a given image, as in Figure
46.

[0471] In Figure 45, step 1118 represents realigning Gj(x,y), the LoG-filtered data from image
"j", to match up with Gi(x,y), the LoG-filtered data from image "i" using the misalignment
values dx and dy determined in step 1116. In preferred embodiments, image "j" is consecutive to
image "i" in the sequence of images. Here, image "j" is image "i+1" such that Gi(x,y) is aligned
with Gi+1(x,y) as shown in block 1177 of Figure 47. Similarly, in Figure 45, step 1124 represents
realigning Gj(x,y), the LoG-filtered data from image "j", to match up with Go(x,y), the
LoG-filtered "gold standard" data from the initial image "o", using the displacement values sum(dx)
and sum(dy) determined in step 1116. Step 1124 of Figure 45 is represented in block 1178 of
Figure 46.

[0472] Step 1120 of Figure 45 represents comparing corresponding validation cells from
Gj(x,y) and Gi(x,y) by computing correlation coefficients for each cell. This is represented
schematically in Figure 47 by blocks 1179, 1181, 1190, 1192, and 1194 for the case where j =
i+1. First, a 128 x 128 pixel central portion of the realigned Gi+1(x,y) is selected, and the
corresponding 128 x 128 pixel central portion of Gi(x,y) is selected, as shown in blocks 1179 and
1181 of Figure 47. An exemplary 128 x 128 pixel validation region 1090 is shown in Figure
44B. Then, one embodiment comprises computing a correlation coefficient for each of 16
validation cells. An exemplary validation cell from each of the realigned Gi+1(x,y) matrix 1181
and Gi(x,y) matrix 1179 is shown in blocks 1192 and 1190 of Figure 47. The validation cells are
as depicted in the 32 x 32 pixel divisions 1092 of the 128 x 128 pixel validation region 1090 of
Figure 44B. Different embodiments use different numbers and/or different sizes of validation
cells. Correlation coefficients are computed for each of the 16 cells, as shown in block 1194 of
Figure 47. Each correlation coefficient is a normalized cross-correlation coefficient as shown in
Equation 80:



where c'(m,n) is the normalized cross-correlation coefficient for the validation cell (m,n), m is an
integer 1 to 4 corresponding to the column of the validation cell whose correlation coefficient is
being calculated, n is an integer 1 to 4 corresponding to the row of the validation cell whose
correlation coefficient is being calculated, p and q are matrix element markers, I1[p,q] are
elements of the cell in column m and row n of the 128 x 128 portion of the realigned image
shown in block 1181 of Figure 47, and I2[p,q] are elements of the cell in column m and row n of
the 128 x 128 portion of Gi(x,y) shown in block 1179 of Figure 47. In that figure, p = 1 to 32
and q = 1 to 32, and the sums shown in Equation 80 are performed over p and q. The
cross-correlation coefficient of Equation 80 is similar to an auto-correlation in the sense that a
subsequent image is realigned with a prior image based on the determined misalignment
correction so that, ideally, the aligned images appear to be identical. A low value of c'(m,n)
indicates a mismatching between two corresponding cells. The misalignment correction
determination is then either validated or rejected based on the values of the 16 correlation
coefficients computed in step 1194 of Figure 47. For example, each correlation coefficient may
be compared against a threshold maximum value. This corresponds to step 1122 of Figure 45.

[0473] Step 1126 of Figure 45 represents comparing corresponding validation cells from
Gj(x,y) and Go(x,y) by computing correlation coefficients for each cell. This is represented
schematically in Figure 46 by blocks 1180, 1182, 1184, 1186, and 1188 for the case where j =
i+1. First, a 128 x 128 pixel central portion of the realigned Gi+1(x,y) is selected, and the
corresponding 128 x 128 pixel central portion of Go(x,y) is selected, as shown in blocks 1182
and 1180 of Figure 46. An exemplary 128 x 128 pixel validation region 1090 is shown in Figure
44B. Then, one embodiment comprises computing a correlation coefficient for each of the 16
validation cells. An exemplary validation cell from each of the realigned Gi+1(x,y) matrix 1182
and Go(x,y) matrix 1180 is shown in blocks 1186 and 1184 of Figure 46. The validation cells are
as depicted in the 32 x 32 pixel divisions 1092 of the 128 x 128 pixel validation region 1090 of
Figure 44B. Other embodiments use different numbers of and/or different sizes of validation
cells. Correlation coefficients are computed for each of the 16 cells, as shown in block 1188 of
Figure 46. Each correlation coefficient is a normalized "auto"-correlation coefficient as shown
in Equation 80 above, where I1[p,q] are elements of the cell in column m and row n of the 128 x
128 portion of the realigned subsequent image shown in block 1182 of Figure 46, and I2[p,q] are
elements of the cell in column m and row n of the 128 x 128 portion of Go(x,y) shown in block
1180 of Figure 46. A low value of c'(m,n) indicates a mismatching between two corresponding
cells. The misalignment determination is then either validated or rejected based on the values of
the 16 correlation coefficients computed in step 1188 of Figure 46. This corresponds to step
1128 of Figure 45.
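
The per-cell comparison of steps 1120 and 1126 may be sketched as follows, for illustration only. The equation for the coefficient is not legible in this passage; the sketch assumes the usual normalized form, c'(m,n) = ΣΣ I1[p,q] I2[p,q] / sqrt(ΣΣ I1²[p,q] · ΣΣ I2²[p,q]), which is consistent with the normalization described for step 1114. The function name `cell_correlations` and the synthetic data are hypothetical.

```python
import numpy as np

def cell_correlations(ref, realigned, cell_size=32):
    """Normalized correlation coefficient for each validation cell.

    ref and realigned are equally sized square arrays (e.g. 128 x 128).
    Returns an (n, n) array indexed [row, column] of the cell grid.
    """
    n = ref.shape[0] // cell_size
    coeffs = np.zeros((n, n))
    for r in range(n):
        for c in range(n):
            rows = slice(r * cell_size, (r + 1) * cell_size)
            cols = slice(c * cell_size, (c + 1) * cell_size)
            a = realigned[rows, cols]   # I1: cell of the realigned image
            b = ref[rows, cols]         # I2: cell of the reference image
            denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
            # A cell with no signal at all is given a coefficient of 0.0
            # (an assumption of this sketch).
            coeffs[r, c] = np.sum(a * b) / denom if denom > 0 else 0.0
    return coeffs

# Synthetic example: identical regions except for one corrupted cell.
rng = np.random.default_rng(1)
ref = rng.standard_normal((128, 128))
realigned = ref.copy()
realigned[0:32, 0:32] = rng.standard_normal((32, 32))
coeffs = cell_correlations(ref, realigned)
```

In this synthetic case the fifteen untouched cells correlate perfectly, while the corrupted cell yields a coefficient near zero, which is the kind of local mismatch the validation step is meant to expose.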

[0474] In one embodiment, determinations of misalignment correction and validation of these
determinations as shown in each of Figure 45, Figure 46, and Figure 47 are performed using a
plurality of the images in sequence. In one embodiment, determinations of misalignment
correction and validations thereof are performed while images are being obtained, so that an
examination in which a given sequence of images is obtained may be aborted before all the
images are obtained. In some embodiments, a misalignment correction is determined, validated,
and compensated for by adjusting the optical signal detection device obtaining the images. In
certain embodiments, an adjustment of the optical signal detection device is made after each of a
plurality of images is obtained. In certain embodiments, an adjustment, if required by the
misalignment correction determination, is made after every image subsequent to the first image
(except the last image), and prior to the next consecutive image. In one embodiment, a cervical
tissue scan comprising a sequence of 13 images is performed using on-the-fly misalignment
correction determination, validation, and camera adjustment, such that the scan is completed in
about 12 seconds. Other embodiments comprise obtaining sequences of any number of images
in more or less time than indicated here.

[0475] Each of steps 1122 and 1128 of the embodiment of Figure 45 represents applying a
validation algorithm to determine at least the following: (1) whether the misalignment correction
can be made, for example, by adjusting the optical signal detection device, and (2) whether the
misalignment correction determined is valid. In an exemplary embodiment, the validation
algorithm determines that a misalignment correction cannot be executed during an
acetowhitening exam conducted on cervical tissue in time to provide sufficiently aligned
subsequent images, if either of conditions (a) or (b) is met, as follows: (a) di, the displacement
between the current image "i" and the immediately preceding image "i-1" is greater than
0.55-mm or (b) Sum(di), the total displacement between the current image and the first image in the
sequence, "o", is greater than 2.5-mm. If either of these conditions is met, the spectral scan in
progress is aborted, and another scan must be performed. If sufficient time remains within the
optimal time window for obtaining spectral data, a fresh scan may begin immediately after a
previous scan is aborted. Other embodiments may comprise the use of different validation rules.
In one embodiment, if only condition (a) is met, the system retakes image "i" while continuing
the spectral scan, and if condition (b) is met, the spectral scan is aborted and must be restarted if
sufficient time remains within the optimal window.
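
Conditions (a) and (b) may be sketched as a small decision function, for illustration only. The names `scan_action` and `pixels_to_mm`, the string return values, and the `retake_on_step` flag are assumptions of this sketch; the default pixel pitch of 0.054 mm is the approximate resolution stated for Figure 44A.

```python
MAX_STEP_MM = 0.55    # condition (a): displacement from the preceding image
MAX_TOTAL_MM = 2.5    # condition (b): displacement from the first image

def pixels_to_mm(pixels, mm_per_pixel=0.054):
    """Convert a displacement in pixels to millimeters."""
    return pixels * mm_per_pixel

def scan_action(d_step_mm, d_total_mm, retake_on_step=False):
    """Decide how to proceed after a misalignment determination.

    With retake_on_step=False, either condition aborts the scan.
    With retake_on_step=True, condition (a) alone leads to retaking the
    current image, while condition (b) still aborts the scan.
    """
    if d_total_mm > MAX_TOTAL_MM:
        return "abort"
    if d_step_mm > MAX_STEP_MM:
        return "retake_image" if retake_on_step else "abort"
    return "continue"

# Example: a 12-pixel step, 20 pixels in total from the first image.
action = scan_action(pixels_to_mm(12), pixels_to_mm(20))
```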
[0476] In one embodiment, validation is performed for each determination of misalignment
correction by counting how many of the correlation coefficients c'(m,n) shown in Equation 80
(corresponding to the 16 validation cells) are less than 0.5. If this number is greater than 1, the
scan in progress is aborted. In one embodiment, if there are more than three correlation
coefficients c'(m,n) less than 0.35, then the scan is aborted. Other embodiments comprise the
use of different validation rules. Gradual changes in image features, such as acetowhitening of
tissue or changes in glare, cause discrepancies which are reflected in the correlation coefficients
of the validation cells, but which do not represent a spatial shift. Thus, in preferred
embodiments, the validation is performed as shown in Figure 47, where validation cells of
consecutive images are used to calculate the correlation coefficients. In other embodiments, the
validation is performed as shown in Figure 46, where validation cells of a current image, "i", and
an initial image of the sequence, "o", are used to calculate the correlation coefficients of
Equation 80.
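
The two counting rules above may be sketched as follows, for illustration only. The function names are hypothetical, and each rule is shown separately because the text presents them as belonging to different embodiments.

```python
import numpy as np

def passes_half_rule(coeffs):
    """Pass unless more than one coefficient is below 0.5."""
    return int(np.count_nonzero(np.asarray(coeffs) < 0.5)) <= 1

def passes_035_rule(coeffs):
    """Pass unless more than three coefficients are below 0.35."""
    return int(np.count_nonzero(np.asarray(coeffs) < 0.35)) <= 3

# Example grids of 16 coefficients (hypothetical values).
good = np.full((4, 4), 0.9)
one_low = good.copy()
one_low[0, 0] = 0.4
two_low = one_low.copy()
two_low[3, 3] = 0.4
four_very_low = good.copy()
four_very_low[0, :] = 0.3
```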

[0477] Figures 48A-F depict a subset of adjusted, filtered images 1200, 1204, 1208, 1212,
1216, and 1220 from a sequence of images of a tissue with an overlay of gridlines showing the
validation cells used in validating the determinations of misalignment correction between the
images, according to an illustrative embodiment of the invention. By performing validation
according to Figure 47, using consecutive images to calculate the correlation coefficients of
Equation 80, the number of validation cells with correlation coefficient below 0.5 for the
misalignment-corrected images of Figure 48A-F is 0, 1, 0, 0, and 1 for images 1204, 1208, 1212,
1216, and 1220, respectively. Since none of the images have more than one coefficient below
0.5, this sequence is successful and is not aborted. There is only a gradually changing glare, seen
to move within the validation region 1202, 1206, 1210, 1214, 1218, 1222 of each image. In an
embodiment in which validation is performed as in Figure 46, the number of validation cells
with correlation coefficient below 0.5 for the misalignment-corrected images of Figure 48A-F is
3, 4, 5, 5, and 6 for images 1204, 1208, 1212, 1216, and 1220, respectively. This is not a good
result in this example, since the exam would be erroneously aborted, due only to gradual changes
in glare or whitening of tissue, not uncompensated movement of the tissue sample. 
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[0478] Alternatively, validation cells that are featureless or have low signal-to-noise ratio are 
eliminated from consideration. Those cells can produce meaningless correlation coefficients. 
Featureless cells in a preferred embodiment are identified and eliminated from consideration by 
examining the deviation of the sum squared gradient of a given validation cell from the mean of 
the sum squared gradient of all cells as shown in Equation 81: 

IF ssg_i(m,n) < Mean[ssg(m,n)] - STD[ssg(m,n)], THEN set c'_i(m,n) = 1.0.    (81) 

where c'_i(m,n) is the correlation of the given validation cell "i", ssg_i(m,n) = Σ Σ I_i^2[p,q], m = 1 
to 4, n = 1 to 4, I_i[p,q] is the matrix of values of the given validation cell "i", p = 1 to 32, q = 1 
to 32, the summations Σ Σ are performed over pixel markers p and q, Mean[ssg(m,n)] is the 
mean of the sum squared gradient of all 16 validation cells, and STD[ssg(m,n)] is the standard 
deviation of the sum squared gradient of the given validation cell "i" from the mean sum 
squared gradient. By setting c'_i(m,n) = 1.0 for the given validation cell, the cell does not count 
against validation of the misalignment correction determination in the rubrics of either step 1122 
or step 1128 of Figure 45, since a correlation coefficient of 1.0 represents a perfect match. 
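
The rule of Equation 81 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the embodiment: the function names are hypothetical, each validation cell is assumed to be a list of pixel rows taken from the filtered image, and the population standard deviation across all cells is assumed for STD.

```python
from statistics import mean, pstdev


def sum_squared(cell):
    """Sum of the squared pixel values of one validation cell (list of rows)."""
    return sum(v * v for row in cell for v in row)


def neutralize_featureless(cells, correlations):
    """Set the correlation of featureless cells to 1.0 (Equation 81 rule).

    cells        -- validation cells of the filtered image, one list of rows each
    correlations -- correlation coefficient of each cell, in the same order
    Returns a new list; cells whose sum squared gradient falls more than one
    standard deviation below the mean get a coefficient of 1.0.
    """
    ssg = [sum_squared(c) for c in cells]
    # Assumption: population standard deviation over all cells.
    cutoff = mean(ssg) - pstdev(ssg)
    return [1.0 if s < cutoff else c for s, c in zip(ssg, correlations)]
```

A cell that is forced to 1.0 in this way can never be counted as a failed cell by a later threshold test, which is the effect described in the text.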

[0479] If an image has large intensity differences between the upper and lower borders and/or 
the left and right borders of the image frame field, LoG filtering may result in "wraparound 
error." A preferred embodiment employs an image blending technique such as "feathering" to 
smooth border discontinuities, while requiring only a minimal amount of additional processing 
time. 

[0480] Figure 49A depicts a sample image 1224 after application of a 9-pixel size [9 x 9] 
Laplacian of Gaussian filter (LoG 9 filter) on an exemplary image from a sequence of images of 
tissue, according to an illustrative embodiment of the invention. The filtered intensity values are 
erroneous at the top edge 1226, the bottom edge 1228, the right edge 1232, and the left edge 
1230 of the image 1224. Since LoG frequency domain filtering corresponds to cyclic 
convolution in the space-time domain, intensity discontinuities between the top and bottom 
edges of an image and between the right and left edges of an image result in erroneous gradient 
approximations. These erroneous gradient approximations can be seen in the dark stripe on the 
right edge 1232 and bottom edge 1228 of the image 1224, as well as the light stripe on the top 
edge 1226 and the left edge 1230 of the image 1224. This often results in a misalignment 
correction determination that is too small, since changes between the images due to spatial shift 
are dwarfed by the edge effects. A preferred embodiment uses a "feathering" technique to 
smooth border discontinuities and reduce "wraparound error." 

[0481] Feathering comprises removal of border discontinuities prior to application of a filter. 
In preferred embodiments, feathering is performed on an image before LoG filtering, for 
example, between steps 1100 and 1102 in Figure 45. In embodiments where LoG filtering is 
performed in the frequency domain (subsequent to Fourier transformation), feathering is 
preferably performed prior to both Fourier transformation and LoG filtering. For two-dimensional 
image intensity (luminance) functions I_1(x,y) and I_2(x,y) that are discontinuous at 
x = x_0, an illustrative feathering algorithm is as follows: 

$$ I_1'(x,y) = I_1(x,y)\, f\!\left(\frac{x - x_0}{d} + 0.5\right) \quad \text{and} \quad I_2'(x,y) = I_2(x,y)\left(1 - f\!\left(\frac{x - x_0}{d} + 0.5\right)\right), $$

$$ f(x) = \begin{cases} 0, & x < 0 \\ 3x^2 - 2x^3, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases} \tag{82} $$

where I_1'(x,y) and I_2'(x,y) are the intensity (luminance) functions I_1(x,y) and I_2(x,y) after 
applying the feathering algorithm of Equation 82, and d is the feathering distance chosen. The 
feathering distance, d, adjusts the tradeoff between removing wraparound error and suppressing 
image content. 
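
A minimal one-dimensional sketch of the feathering weights follows. It assumes that the weighting function f rises smoothly from 0 to 1 across the feathering distance d centered on the discontinuity x_0, and that f(x) = 1 for x > 1 so that f is continuous at x = 1. The function names are hypothetical, and a single row of pixels stands in for a full image.

```python
def smoothstep(x):
    """Weight f(x): 0 below 0, 3x^2 - 2x^3 on [0, 1], and (assumed) 1 above 1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return 3 * x * x - 2 * x ** 3


def feather_pair(i1, i2, x0, d):
    """Feather two intensity profiles that meet at a discontinuity at x = x0.

    i1, i2 -- equal-length lists of intensities sampled at x = 0, 1, 2, ...
    d      -- feathering distance in pixels
    Returns (i1', i2'): i1 weighted by f and i2 weighted by 1 - f.
    """
    weights = [smoothstep((x - x0) / d + 0.5) for x in range(len(i1))]
    i1_f = [a * w for a, w in zip(i1, weights)]
    i2_f = [b * (1 - w) for b, w in zip(i2, weights)]
    return i1_f, i2_f
```

With this weighting, samples farther than d/2 from x_0 are either passed through unchanged or suppressed entirely, and only the band of width d around the discontinuity is blended.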

[0482] Figure 49B depicts the application of both a feathering technique and a LoG filter on 
the same unfiltered image used in Figure 49A. The feathering is performed to account for border 
processing effects, according to an illustrative embodiment of the invention. Here, a feathering 
distance, d, of 20 pixels was used. Other embodiments use other values of d. The filtered image 
1234 of Figure 49B does not display uncharacteristically large or small gradient intensity values 
at the top edge 1236, bottom edge 1238, right edge 1242, or left edge 1240, since discontinuities 
are smoothed prior to LoG filtering. Also, there is minimal contrast suppression of image detail 
at the borders. Pixels outside the feathering distance, d, are not affected. The use of feathering 
here results in more accurate determinations of misalignment correction between two images in a 
sequence of images. 

[0483] Another method of border smoothing is multiplication of unfiltered image data by a 
Hamming window. In some embodiments, image data are multiplied by a Hamming window function 
before Fourier transformation so that the border pixels are gradually modified to remove 
discontinuities. However, application of the Hamming window suppresses image intensity as 
well as gradient information near the border of an image. 
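
The suppression near the border can be seen in a small sketch. It assumes the common separable form of a two-dimensional Hamming window (the product of a row window and a column window), which the text does not specify; the function names are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*k / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def apply_hamming(image):
    """Multiply an image (list of rows) by a separable 2-D Hamming window."""
    rows, cols = len(image), len(image[0])
    w_rows, w_cols = hamming(rows), hamming(cols)
    return [[image[r][c] * w_rows[r] * w_cols[c] for c in range(cols)]
            for r in range(rows)]
```

For a uniform image, the center pixel keeps essentially its full value while a corner pixel is scaled by about 0.08 x 0.08, which illustrates why image content near the border is lost along with the discontinuity.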

[0484] Figure 50A is identical to Figure 49A and depicts the application of a LoG 9 filter on 
an exemplary image from a sequence of images of tissue according to an illustrative embodiment 
of the invention. The filtered intensity values are erroneous at the top edge 1226, the bottom 
edge 1228, the right edge 1232, and the left edge 1230 of the image 1224. 

[0485] Figure 50B depicts the application of both a Hamming window and a LoG 9 filter on 
the same unfiltered image used in Figure 50A. Hamming windowing is performed to account for 
border processing effects, according to an illustrative embodiment of the invention. The 
edges 1246, 1248, 1250, 1252 of the image 1244 of Figure 50B no longer show the extreme 
filtered intensity values seen at the edges 1226, 1228, 1230, 1232 of the image 1224 of Figure 
50A. However, there is a greater suppression of image detail in Figure 50B than in Figure 49B. 
Thus, for this particular embodiment, application of the feathering technique is preferred over 
application of Hamming windowing. 

[0486] One embodiment includes removing cyclic convolution artifacts by zero padding the 
image prior to frequency domain filtering to assure image data at an edge would not affect 
filtering output at the opposite edge. This technique adds computational complexity and may 
increase processing time. 
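
The effect of zero padding can be illustrated in one dimension. Multiplication in the frequency domain is equivalent to cyclic convolution, so the sketch below computes the cyclic convolution directly rather than through a Fourier transform. The function names and the tiny signal are illustrative assumptions only.

```python
def cyclic_convolve(signal, kernel):
    """Cyclic (wraparound) convolution, the spatial equivalent of
    multiplying the two spectra in the frequency domain."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]


def padded_convolve(signal, kernel):
    """Zero pad by len(kernel) - 1 samples before the cyclic convolution,
    then crop, so data at one edge cannot leak into the opposite edge."""
    pad = len(kernel) - 1
    padded = list(signal) + [0.0] * pad
    return cyclic_convolve(padded, kernel)[:len(signal)]
```

For the signal [0, 0, 0, 9] and the kernel [1, 1], the cyclic result has a spurious 9 in its first sample, carried over from the last sample, while the padded result does not. The padded version processes a longer array, which reflects the added computational cost noted in the text.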

[0487] Figures 51A-F depict the determination of a misalignment correction between two 
images using methods including the application of LoG filters of various sizes, as well as the 
application of a Hamming window technique and a feathering technique, according to illustrative 
embodiments of the invention. Image 1254 and image 1256 of Figures 51A-B are consecutive 
images from a sequence of images of cervix tissue obtained during a diagnostic exam, each with 
a pixel resolution of about 0.054-mm. Figures 51C-F depict the application of four different 
image filtering algorithms: (1) Hamming window with LoG 9 filtering, (2) feathering with LoG 
9 filtering, (3) feathering with LoG 21 filtering, and (4) feathering with LoG 31 filtering. Each 
of these algorithms is implemented as part of a misalignment correction determination and 
validation technique as illustrated in Figure 45 and Figure 47, and values of d_x and d_y between 
images 1254 and 1256 of Figures 51A-B are determined using each of the four filtering 
algorithms. For image 1254, each of the four different image filtering algorithms (1) - (4) listed 
above is applied, resulting in images 1258, 1262, 1266, and 1270, respectively, each having 
256 x 256 pixels. The four different image filtering algorithms are also applied for image 1256, 
resulting in images 1260, 1264, 1268, and 1272, respectively, each having 256 x 256 pixels. 
Values of (d_x, d_y) determined using Hamming + LoG 9 filtering are (-7, 0), expressed in pixels. 
Values of (d_x, d_y) determined using feathering + LoG 9 filtering are (-2, -10). Values of (d_x, d_y) 
determined using feathering + LoG 21 filtering are (-1, -9). Values of (d_x, d_y) determined using 
feathering + LoG 31 filtering are (0, -8). All of the displacement values determined using 
feathering are close in this embodiment, and agree well with visually-verified displacement. 
However, in this example, the displacement values determined using Hamming windowing are 
different from those obtained using the other three filtering methods, and result in a 
misalignment correction that does not agree well with visually-verified displacement. Thus, for 
this example, feathering works best since it does not suppress as much useful image data. 

[0488] The effects of the filtering algorithm employed, as well as the choice of validation rules, 
are examined by applying combinations of the various filtering algorithms and validation rules to 
pairs of sequential images of tissue and determining the number of "true positives" and "false 
positives" identified. A true positive occurs when a bad misalignment correction determination 
is properly rejected by a given validation rule. A false positive occurs when a good 
misalignment correction determination is improperly rejected as a failure by a given validation 
rule. The classification of a validation result as a "true positive" or a "false positive" is made by 
visual inspection of the pair of sequential images. In preferred embodiments, whenever true 
failures occur, the scan should be aborted. Some examples of situations where true failures 
occur in certain embodiments include image pairs between which there is one or more of the 
following: a large non-translational deformation such as warping or tilting; a large jump for 
which motion tracking cannot compute a correct translational displacement; rotation greater than 
about 3 degrees; situations in which a target laser is left on; video system failure such as blur, 
dark scan lines, or frame shifting; cases where the image is too dark and noisy, in shadow; cases 
where a vaginal speculum (or other obstruction) blocks about half the image; other obstructions 
such as sudden bleeding. 

[0489] In one embodiment, a set of validation rules is chosen such that true positives are 
maximized and false positives are minimized. Sensitivity and specificity can be adjusted by 
adjusting choice of filtering algorithms and/or choice of validation rules. Table 4 shows the 
number of true positives (true failures) and false positives (false failures) determined by a 
validation rule as depicted in Figure 45 and Figure 47 where validation is determined using 
consecutive images. Table 4 shows various combinations of filtering algorithms and validation 
rules. The four filtering algorithms used are (1) Hamming windowing with LoG 9 filtering, (2) 
feathering with LoG 9 filtering, (3) feathering with LoG 21 filtering, and (4) feathering with LoG 
31 filtering. The values, c'(m,n), correspond to the normalized autocorrelation coefficient of 
Equation 80 whose value must be met or exceeded in order for a validation cell to "pass" in an 
embodiment. The "Number Threshold" column indicates the maximum number of "failed" 
validation cells, out of the 16 total cells, that are allowed for a misalignment correction 
determination to be accepted in an embodiment. If more than this number of validation cells fail, 
then the misalignment correction determination is rejected. 
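
The acceptance rule described above reduces to counting failed cells. The sketch below is a hedged illustration with a hypothetical function name; the coefficient threshold and number threshold are passed in as parameters rather than fixed to any one row of Table 4.

```python
def accept_correction(cell_correlations, coeff_threshold, number_threshold):
    """Decide whether a misalignment correction determination is accepted.

    A validation cell passes when its correlation coefficient meets or
    exceeds coeff_threshold. The determination is accepted when the number
    of failed cells does not exceed number_threshold.
    """
    failed = sum(1 for c in cell_correlations if c < coeff_threshold)
    return failed <= number_threshold
```

For example, with a coefficient threshold of 0.5 and a number threshold of 3, a set of 16 cells with three failures is accepted and a set with four failures is rejected.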



Table 4: True positives and false positives of validation determinations for embodiments using various 
combinations of filtering algorithms and validation rules. 

Filtering algorithm    c'(m,n)    Number Threshold    TP    FP
Hamming LoG 9          -0.1       1                   34    28
Feathering LoG 9       -0.1       3                   19    17
Feathering LoG 21      0.3        2                   46    10
                       0.35       3                   52    4
Feathering LoG 31      0.5        3                   48    3


[0490] For the given set of cervical image pairs on which the methods shown in Table 4 were 
applied, feathering performs better than Hamming windowing, since there are more true 
positives and fewer false positives. Among different LoG filter sizes, LoG 21 and LoG 31 
perform better than LoG 9 for both tracking and validation here. The LoG 21 filter is more 
sensitive to rotation and deformation than the LoG 31 filter for these examples. One 
embodiment of the determination and validation of misalignment corrections between 256 x 256 
pixel portions of images of cervical tissue with pixel resolution of about 0.054-mm employs one 
or more of the following: (1) use of feathering for image border processing, (2) application of 
LoG 21 filter, (3) elimination of validation cells with low signal-to-noise ratio, and (4) use of 
consecutive images for validation. 

Broadband reflectance arbitration and low-signal masking 
[0491] A tissue characterization system as shown in Figure 1 also may comprise arbitrating 
between two or more redundant sets of spectral data as depicted in step 128 of Figure 1. In one 
embodiment shown in Figure 1, step 128 includes arbitrating between two sets of broadband 
reflectance data obtained in step 104 during a spectral scan for each interrogation point of a 
tissue sample. Data are obtained at each interrogation point using light incident to the 
interrogation point at two different angles, as depicted in Figure 8. In this way, if only one set of 
reflectance data is affected by an artifact such as glare or shadow, the other set can be used in 
tissue classification, for example, in step 132 of Figure 1. The arbitration step 128 in Figure 1 
determines whether either of the two sets of reflectance spectral data at each point is affected by 
an artifact. Step 128 also determines a single set of reflectance data from each interrogation 
point to be used in tissue classification if at least one of the two sets is acceptably unaffected by 
an artifact. As used here, artifacts identified in the arbitration step 128 of Figure 1 include, for 
example, both lighting artifacts and obstruction artifacts - such as glare, shadow, blood, mucus, a 
speculum, smoke tube tissue, and/or os tissue. 

[0492] In the embodiment shown in Figure 1, step 128 additionally includes a first-level "hard 
masking" of certain interrogation points. For example, interrogation points are considered 
"indeterminate" where values of both sets of reflectance spectral data and/or values of the set of 
fluorescence data are low due to shadow or an obstruction. Additional spectral masks, both hard 
masks and soft masks, are determined in one embodiment in step 130 of Figure 1. As discussed 
herein, hard-masking of data includes eliminating identified, potentially non-representative data 
from further consideration and identifying the corresponding tissue region as "indeterminate", 
while soft-masking includes applying a weighting function or weighting factor to identified, 
potentially non-representative data so that the importance of the data as a diagnostic indicator of 
a tissue region in a tissue classification algorithm is thereby reduced. A point that is soft-masked 
is not necessarily identified as "indeterminate". 
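
The distinction between the two kinds of mask can be sketched as follows. The function name, the notion of a per-point numeric "score", and the default weight are assumptions made only for illustration; the actual masks and classification inputs are defined elsewhere in the specification.

```python
INDETERMINATE = "indeterminate"


def mask_point(score, hard_masked=False, soft_weight=1.0):
    """Apply a hard or soft mask to one interrogation point.

    Returns (label, weighted_score):
    - a hard-masked point is labeled indeterminate and carries no score,
      so it is removed from further consideration;
    - any other point keeps a score, scaled by soft_weight (a value below
      1.0 reduces its importance without discarding it).
    """
    if hard_masked:
        return INDETERMINATE, None
    return None, score * soft_weight
```

A soft-masked point therefore still reaches the classifier, only with reduced influence, whereas a hard-masked point does not.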

[0493] The diagram 284 of Figure 8 shows that a misalignment of the probe 142 may create 
conditions where either or both of the top and bottom speculum blades 286 block part or all of 
the illumination path from either or both of the intersecting upper and lower cones of 
illuminating light 196, 198, thereby affecting the spectral data obtained for the region 250 of the 
tissue sample 194. The speculum blades, or other obstructions present during a spectral scan, 
may physically obstruct the region 250 being analyzed, or may partially obstruct the light 
illuminating the region 250 causing a shadow. In either case, the spectral data obtained may be 
adversely affected and rendered unusable for characterizing the region of the tissue sample. 
Obtaining multiple sets of spectral data using illumination from sources at various positions and 
angles improves the chances of obtaining at least one set of spectral data that is not affected by 
glare, shadow, and/or obstructions. 

[0494] Figure 52 shows a graph 1276 depicting exemplary mean values of reflectance spectral 
data 1278 as a function of wavelength 1280 for tissue regions affected by glare 1282, tissue 
regions affected by shadow 1284, and tissue regions affected by neither glare nor shadow 1286 
according to an illustrative embodiment of the invention. The reflectance spectral data 1278 
represent the fraction of incident light that is reflected from the sample. The graph 1276 shows 
that the reflectance values of a region of tissue affected by glare 1282 are higher at all measured 
wavelengths than the reflectance of a region of tissue not affected by glare 1286. The graph 
1276 also shows that the reflectance values of a region of tissue with illumination partially 
blocked by a speculum blade such that the region is in shadow 1284, are lower at all measured 
wavelengths than the reflectance of a region of tissue not affected by shadow 1286. The shapes 
of all three curves 1282, 1284, 1286 are different. In this example, the data affected by glare or 
shadow may not be usable to determine a condition or characteristic of the region of the sample, 
if the data are not representative of the region of the tissue sample. Hence, glare and shadow 
may adversely affect spectral data obtained for a region of a tissue sample. 

[0495] In one embodiment, step 104 of Figure 1 comprises obtaining one fluorescence 
spectrum and two broadband reflectance spectra at each of a plurality of scan locations of the 
sample tissue (interrogation points). Here, a spectrum refers to a collection of spectral data over 
a range of wavelengths. In one embodiment, spectral data are collected over a range of 
wavelengths between 360 and 720 nm in 1 nm increments. In other embodiments, the range of 
wavelengths lies anywhere between about 190 nm and 1100 nm. Here, the two reflectance spectra 
are referred to as the BB1 (broadband one) and BB2 (broadband two) spectra. BB1 and BB2 
differ in the way that the tissue is illuminated at the time the spectral data are obtained as 
described below. In the embodiment shown in Figure 6, the probe head 192 has 4 illumination 
sources 222, 224, 226, 228 located circumferentially about the collection optics 200. Two 
sources are above 222, 224 and two are below the horizontal plane 226, 228, as illustrated in the 
second arrangement 212 of Figure 6. The two upper sources are used to obtain BB1 spectra and 
the two lower sources are used to obtain BB2 spectra. Since the upper and lower sources 
illuminate a region of the tissue sample using light incident to the region at different angles, an 
artifact - for example, glare or shadow - may affect one of the two reflectance spectra obtained for the 
region, while the other reflectance spectrum is unaffected. For example, during acquisition of 
spectral data, the BB1 spectrum may be unaffected by an artifact even if the BB2 spectrum is 
adversely affected by the artifact. In such a case, BB1 spectral data may be used to characterize 
the condition of the region of tissue, for example, in step 132 of Figure 1, even though the BB2 
data is not representative of the region. In other embodiments, the BB1 and BB2 spectra 
comprise one or more other types of spectral data, such as absorbance spectra, adsorption 
spectra, transmission spectra, fluorescence spectra, and/or other types of optical and atomic 
emission spectra. 

[0496] Figure 53 shows a graph 1287 depicting mean values and standard deviations of 
broadband reflectance spectral data using the BB1 channel light source for regions confirmed as 
being obscured by blood, obscured by mucus, obscured by glare from the BB1 source, obscured 
by glare from the BB2 source, or unobscured, according to an illustrative embodiment of the 
invention. Various sample test points corresponding to regions of tissue from patient scans were 
visually identified as having blood, mucus, or glare present. A sample point was identified as 
having blood present if it was completely covered by blood and if there was no glare. A sample 
point was identified as having mucus present if it was completely covered by mucus and if there 
was no glare. A sample point was identified as having glare based on visual evidence of glare 
and large reflectance values in at least one of the two sets of reflectance spectral data (the BB1 
spectrum or the BB2 spectrum). Figure 53 shows the range of BB1 reflectance values 1288 for a 
given category of the sample test points which lie within one standard deviation of the mean for 
the category, plotted as a function of wavelength 1290. Figure 53 shows ranges of BB1 
reflectance values 1288 for each of the following categories of sample test points: those 
identified as having blood present 1292, those identified as having mucus present 1294, those 
identified as having glare from the BB1 illumination source 1296, those identified as having 
glare from the BB2 illumination source 1298, and those identified as unobstructed tissue 1300. 

[0497] Similarly, Figure 54 shows a graph 1301 depicting mean values and standard deviations 
of broadband reflectance spectral data using the BB2 channel light source for regions confirmed 
as being obscured by blood 1304, obscured by mucus 1306, obscured by glare from the BB1 
source 1308, obscured by glare from the BB2 source 1310, or unobscured 1312, according to an 
illustrative embodiment of the invention. Figure 54 shows the range of BB2 reflectance values 
1302 for a given category of the sample test points which lie within one standard deviation of the 
mean for the category, plotted as a function of wavelength 1290. Figure 54 shows ranges of BB2 
reflectance values 1302 for each of the following categories of sample test points: those 
identified as having blood present 1304, those identified as having mucus present 1306, those 
identified as having glare from the BB1 illumination source 1308, those identified as having 
glare from the BB2 illumination source 1310, and those identified as unobstructed tissue 1312. 

[0498] Figures 53 and 54 show that a region with glare from one illumination source does not 
necessarily have high reflectance values corresponding to data obtained using the other 
illumination source. For example, in Figure 53, the range of BB1 reflectance values 1288 of 
points with visual evidence of glare from the BB2 source 1298 is similar to the range of BB1 
reflectance values 1288 of unobstructed tissue 1300. Similarly, in Figure 54, the range of BB2 
reflectance values 1302 of points demonstrating glare from the BB1 source 1308 is similar to the 
range of BB2 reflectance values 1302 of unobstructed tissue 1312. Therefore, one of the two 
sets of reflectance spectral data may be useful in characterizing the tissue even if the other of the 
two sets is corrupted by an artifact, such as glare. 

[0499] It may also be desirable to determine spectral characteristics caused by various artifacts 
so that data corresponding to a region affected by a given artifact may be identified or to 
determine a spectral characteristic of an artifact based on the spectral data itself, without having 
to rely on other visual evidence of a given artifact. In order to determine these spectral 
characteristics, an embodiment of the invention comprises using spectral data known to be 
affected by a given artifact based on visual evidence, as well as spectral data known not to be 
affected by an artifact. Techniques that may be used to identify spectral characteristics and/or to 
develop classification rules determining whether given data are affected by an artifact include, 
for example, discriminant analysis (linear, nonlinear, multivariate), neural networks, principal 
component analysis, and decision tree analysis. One embodiment comprises determining a 
particular wavelength that gives the greatest difference between the artifact-affected spectral data 
(the outlier) and spectral data from corresponding nearby tissue that is known to be unaffected by 
the artifact (the tissue). Alternatively, the embodiment comprises determining a wavelength that 
gives the largest difference between the outlier and the tissue, as weighted by a measure of 
variability of the data. In one embodiment, this method locates where the difference between the 
mean reflectance for the outlier and the tissue is at a maximum relative to the difference between 
the standard deviations for the outlier data and the tissue data. In one embodiment, the method 
determines a maximum value of D as a function of wavelength, where D is the difference given 
in Equation 83 below: 

$$ D(\lambda) = \frac{\left| \mu(BB(\lambda))_{Outlier} - \mu(BB(\lambda))_{Tissue} \right|}{\sqrt{\sigma^2(BB(\lambda))_{Outlier} + \sigma^2(BB(\lambda))_{Tissue}}} \tag{83} $$

where μ(BB(λ))_Outlier is the mean of a set of reflectance spectral data at wavelength λ known to be 
affected by a given artifact, μ(BB(λ))_Tissue is the mean of a set of reflectance spectral data at 
wavelength λ that is known not to be affected by the artifact, σ(BB(λ))_Outlier is the standard 
deviation of the set of reflectance spectral data at wavelength λ known to be affected by the 
given artifact, and σ(BB(λ))_Tissue is the standard deviation of the set of reflectance spectral data at 
wavelength λ known not to be affected by the given artifact. 
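
A hedged sketch of the wavelength search follows. It assumes that D is the absolute difference between the outlier and tissue means divided by the square root of the sum of their variances, and it uses the sample standard deviation; both are assumptions for illustration, as are the function names and the layout of the spectra (one list of reflectance values per measured point, aligned with the wavelength list).

```python
import math
from statistics import mean, stdev


def weighted_difference(outlier_values, tissue_values):
    """D at one wavelength: |mean difference| / sqrt(sum of variances).

    Raises ZeroDivisionError if both groups have zero variance.
    """
    num = abs(mean(outlier_values) - mean(tissue_values))
    den = math.sqrt(stdev(outlier_values) ** 2 + stdev(tissue_values) ** 2)
    return num / den


def best_wavelength(outlier_spectra, tissue_spectra, wavelengths):
    """Return (wavelength, D) for the wavelength that maximizes D."""
    scores = []
    for k, wl in enumerate(wavelengths):
        d = weighted_difference([s[k] for s in outlier_spectra],
                                [s[k] for s in tissue_spectra])
        scores.append((d, wl))
    d_max, wl_max = max(scores)
    return wl_max, d_max
```

Dividing by the pooled spread favors wavelengths where the two groups are well separated relative to their variability, not merely where their means differ most.
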

[0500] Figure 55 shows a graph 1313 depicting the weighted difference 1314 between the 
mean reflectance values of glare-obscured regions and unobscured regions of tissue as a function 
of wavelength 1316, according to an illustrative embodiment of the invention. The weighted 
difference 1314 is as given in Equation 83. For the data sets used in Figure 55, the wavelength 
providing the maximum value 1318 of D in Equation 83 is about 420 nm. Thus, exemplary 
spectral characteristics identifiable with this set of glare-obscured "outlier" data include the 
reflectance spectral data at around 420nm, and any deviation of this data from reflectance 
spectral "tissue" data for unobscured regions of correspondingly similar tissue at around 420nm. 
This embodiment uses reflectance spectral data. Other embodiments may use other types of 
spectral data, including fluorescence data. 

[0501] Figure 56 shows a graph 1319 depicting the weighted difference 1314 between the 
mean reflectance values of blood-obscured regions and unobscured regions of tissue as a 
function of wavelength 1316, according to an illustrative embodiment of the invention. The 
weighted difference is as given in Equation 83. For the data sets used in Figure 56, the 
wavelength providing the maximum value 1320 of D in Equation 83 is about 585 nm. 
[0502] Thus, exemplary spectral characteristics identifiable with this set of blood-obscured 
"outlier" data include the reflectance spectral data at about 585nm, and any deviation of this data 
from reflectance spectral "tissue" data for unobscured regions of correspondingly similar tissue 
at about 585nm. This embodiment uses reflectance spectral data. Other embodiments may use 
other types of spectral data, including fluorescence spectral data. 

[0503] Figure 57 shows a graph 1321 depicting the weighted difference 1314 between the 
mean reflectance values of mucus-obscured regions and unobscured regions of tissue as a 
function of wavelength 1316, according to an illustrative embodiment of the invention. The 
weighted difference is as given in Equation 83. For the data sets used in Figure 57, the 
wavelength providing the maximum value 1322 of D in Equation 83 is about 577 nm. Thus, 
exemplary spectral characteristics identifiable with this set of mucus-obscured "outlier" data 
include the reflectance spectral data at about 577 nm, and any deviation of this data from 
reflectance spectral "tissue" data for unobscured regions of correspondingly similar tissue at 
about 577 nm. This embodiment uses reflectance spectral data. Other embodiments may use 
other types of spectral data, including fluorescence spectral data. 

[0504] One illustrative embodiment comprises determining two wavelengths where the ratio of 
spectral data at the two wavelengths is most different for the artifact-affected spectral data (the 
"outlier") and spectral data from corresponding nearby tissue that is known to be unaffected by 
the artifact (the "tissue"). Alternatively, the method comprises determining two wavelengths 
where the ratio of spectral data at the two wavelengths weighted by a measure of variability is 
most different for the outlier data and the tissue data. In one embodiment, the method comprises 
determining a maximum value of D as a function of wavelength, where D is the difference given 
in Equation 84 below: 

$$ D = \frac{\left| \mu(BB(\lambda)/BB(\lambda'))_{Outlier} - \mu(BB(\lambda)/BB(\lambda'))_{Tissue} \right|}{\sqrt{\sigma^2(BB(\lambda)/BB(\lambda'))_{Outlier} + \sigma^2(BB(\lambda)/BB(\lambda'))_{Tissue}}} \tag{84} $$

where μ(BB(λ)/BB(λ'))_Outlier is the mean of the ratios of reflectance at wavelength λ and 
reflectance at wavelength λ' for a set of reflectance spectral data known to be affected by a given 
artifact, μ(BB(λ)/BB(λ'))_Tissue is the mean of the ratios of reflectance at wavelength λ and 
reflectance at wavelength λ' for a set of reflectance spectral data that is known not to be affected 
by the given artifact, σ(BB(λ)/BB(λ'))_Outlier is the standard deviation of the ratios of reflectance at 
wavelength λ and reflectance at wavelength λ' for a set of reflectance spectral data known to be 
affected by the given artifact, and σ(BB(λ)/BB(λ'))_Tissue is the standard deviation of the ratios of 
reflectance at wavelength λ and reflectance at wavelength λ' for a set of reflectance spectral data 
known not to be affected by the given artifact. 
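
The two-wavelength search can be sketched by evaluating the metric for every ordered pair of wavelengths. As an assumption for illustration, D is taken to be the absolute difference between the mean ratios of the outlier and tissue data divided by the square root of the sum of the variances of those ratios; the function names, the sample standard deviation, and the exhaustive search are likewise illustrative choices.

```python
import math
from itertools import permutations
from statistics import mean, stdev


def ratio_metric(outlier_spectra, tissue_spectra, i, j):
    """D for numerator index i and denominator index j.

    Each spectrum is a list of reflectance values aligned with the
    wavelength list. Raises ZeroDivisionError if a denominator value is
    zero or if both groups of ratios have zero variance.
    """
    r_out = [s[i] / s[j] for s in outlier_spectra]
    r_tis = [s[i] / s[j] for s in tissue_spectra]
    num = abs(mean(r_out) - mean(r_tis))
    den = math.sqrt(stdev(r_out) ** 2 + stdev(r_tis) ** 2)
    return num / den


def best_pair(outlier_spectra, tissue_spectra, wavelengths):
    """Return (numerator wavelength, denominator wavelength) maximizing D."""
    pairs = permutations(range(len(wavelengths)), 2)
    i, j = max(pairs,
               key=lambda p: ratio_metric(outlier_spectra, tissue_spectra, *p))
    return wavelengths[i], wavelengths[j]
```

The exhaustive search grows with the square of the number of wavelengths, which is manageable for a few hundred wavelengths but is only one possible strategy.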

[0505] Figure 58 shows a graph 1323 depicting a ratio of the weighted differences 1324 
between the mean reflectance values of glare-obscured regions and unobscured regions of tissue 
at two wavelengths, a numerator wavelength 1326 and a denominator wavelength 1328, 
according to an illustrative embodiment of the invention. The weighted difference 1324 is as 
given in Equation 84. For the data sets used in Figure 58, the two wavelengths providing the 
maximum value of D in Equation 84 are about 401 nm (numerator) and about 404 nm 
(denominator). Thus, exemplary spectral characteristics identifiable with this set of 
glare-obscured "outlier" data include the ratio of reflectance spectral data at about 401 nm and the 
reflectance spectral data at about 404 nm, as well as any deviation of this ratio from those of 
corresponding regions of similar but unobscured tissue. This embodiment uses reflectance 
spectral data. Other embodiments may use other types of spectral data, including fluorescence 
data. 

[0506] Figure 59 shows a graph 1325 depicting a ratio of the weighted differences 1324 
between the mean reflectance values of blood-obscured regions and unobscured regions of tissue 
at two wavelengths, a numerator wavelength 1326 and a denominator wavelength 1328, 
according to an illustrative embodiment of the invention. The weighted difference is as given in 
Equation 84. For the data sets used in Figure 59, the two wavelengths providing the maximum 
value of D in Equation 84 are about 595 nm (numerator) and about 718 nm (denominator). 
Thus, an exemplary spectral characteristic identifiable with this set of blood-obscured "outlier" 
data includes the ratio of the reflectance spectral data at about 595 nm and the reflectance spectral 
data at about 718 nm. This embodiment uses reflectance spectral data. Other embodiments may 
use other types of spectral data, including fluorescence data. 

[0507] Figure 60 shows a graph 1327 depicting a ratio of the weighted differences 1324
between the mean reflectance values of mucus-obscured regions and unobscured regions of
tissue at two wavelengths, a numerator wavelength 1326 and a denominator wavelength 1328,
according to an illustrative embodiment of the invention. The weighted difference is as given in
Equation 84. For the data sets used in Figure 60, the two wavelengths providing the maximum
value of D in Equation 84 are about 545 nm (numerator) and about 533 nm (denominator).
Thus, an exemplary spectral characteristic identifiable with this set of mucus-obscured "outlier"
data includes the ratio of the reflectance spectral data at about 545 nm to the reflectance
spectral data at about 533 nm. This embodiment uses reflectance spectral data. Other
embodiments may use other types of spectral data, including fluorescence data.
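
As a minimal sketch of how the wavelength-pair ratios described for Figures 58 to 60 might be evaluated on a sampled spectrum, the following Python fragment interpolates a reflectance spectrum at the two wavelengths and takes their quotient. The function and dictionary names are hypothetical, linear interpolation between samples is an assumption, and the sketch does not implement the weighted difference D of Equation 84.

```python
import numpy as np

# (numerator nm, denominator nm) pairs quoted in the text for Figures 58-60.
# The dictionary name and its keys are illustrative only.
ARTIFACT_BAND_PAIRS = {
    "glare": (401.0, 404.0),
    "blood": (595.0, 718.0),
    "mucus": (545.0, 533.0),
}


def band_ratio(wavelengths, reflectance, numerator_nm, denominator_nm):
    """Ratio of reflectance at two wavelengths (linear interpolation assumed)."""
    num = np.interp(numerator_nm, wavelengths, reflectance)
    den = np.interp(denominator_nm, wavelengths, reflectance)
    return float(num / den)


def artifact_ratios(wavelengths, reflectance):
    """Evaluate every wavelength-pair ratio for one reflectance spectrum."""
    return {
        name: band_ratio(wavelengths, reflectance, num_nm, den_nm)
        for name, (num_nm, den_nm) in ARTIFACT_BAND_PAIRS.items()
    }
```

A ratio computed this way could then be compared with the same ratio from similar but unobscured tissue, as the text suggests; how large a deviation counts as significant is not specified here.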

[0508] Another type of lighting artifact which may obscure spectral data is shadow, which may
be caused, for example, by an obstruction blocking part of the light from an illumination source
on the optical probe 142 of the embodiment apparatus. It may be important to differentiate
between glare and shadow, so that spectral data representing unobstructed tissue can be properly
identified. In an embodiment, broadband reflectance is expressed as the intensity of light
diffusely reflected from a region of the tissue, $I_t$, over the intensity of incident light, $I_0$, at the
region. When glare is measured in addition to light diffusely reflected from the tissue, a
percentage of the original intensity of incident light is included in the tissue reflectance
measurement, so that the "reflectance" reading of a region of a sample experiencing glare,
$R_g(\lambda)$, may be expressed as in Equation 85:

$$R_g(\lambda) = \frac{I_t(\lambda) + \alpha I_0(\lambda)}{I_0(\lambda)}, \tag{85}$$

where $\alpha$ is a real number between 0.0 and 1.0; $I_t(\lambda)$ is the intensity of light diffusely reflected
from the region of tissue at wavelength $\lambda$; and $I_0(\lambda)$ is the intensity of light incident on the region
of the sample at wavelength $\lambda$. The intensity of the specularly-reflected light is $\alpha I_0(\lambda)$. When
the region of the sample is shadowed, only a portion of the incident intensity reaches the region.
Thus, the "reflectance" reading of a region of a sample experiencing shadow, $R_s(\lambda)$, may be
expressed as in Equation 86:

$$R_s(\lambda) = \frac{\beta I_t(\lambda)}{I_0(\lambda)}, \tag{86}$$

where $\beta$ is a real number between 0.0 and 1.0; $I_t(\lambda)$ is the intensity of light at wavelength
$\lambda$ diffusely reflected from the region of tissue with an incident light intensity of $I_0(\lambda)$; and
$I_0(\lambda)$ is the intensity of light at wavelength $\lambda$ that would be incident on the region of the
sample if unshadowed.

[0509] In one embodiment, the arbitration in step 128 of Figure 1 comprises determining if
only one set of a pair of sets of spectral data is affected by a lighting artifact, such as glare or
shadow, each set having been obtained using light incident on the sample at a unique angle. If it
is determined that only one set of a pair of sets of spectral data is affected by the artifact, then the
other set of spectral data may be used in the determination of a characteristic of the region of the
sample, for example. In one embodiment, it is determined that there is evidence of a lighting
artifact in the spectral data. Such evidence may be a large difference between the reflectance
measurements of the two sets of spectral data. If such evidence exists, then one of the
reflectance measurements will be either $R_g$ or $R_s$, as given by Equation 85 and Equation 86. In
cases where members of only one set are affected by a lighting artifact, the remaining set of
reflectance measurements may be expressed as $R$, the intensity of light diffusely reflected from
the region of the tissue, $I_t$, divided by the intensity of light incident on the region of the tissue, $I_0$.
In an embodiment method, the larger of the two reflectance measurements corresponding to a
given wavelength is divided by the smaller. In cases where only one of the sets is affected by a
lighting artifact, the resulting quotient will be either $R_g/R$, which is equal to
$1 + \alpha I_0(\lambda)/I_t(\lambda)$, or $R/R_s$, which is equal to the constant $1/\beta$, where $\alpha$ and $\beta$ are the
fractions defined for Equation 85 and Equation 86. If glare is present, the value of the quotient
will depend on wavelength, and the plot of the quotient as a function of wavelength should look
like an inverted unobstructed tissue broadband signal because of the $\alpha I_0(\lambda)/I_t(\lambda)$ term. If
shadow is present, the plot of the quotient should be constant across the spectrum.
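
The behavior of the quotient described above can be checked numerically. The following Python sketch evaluates the glare and shadow readings of Equation 85 and Equation 86 on a made-up tissue spectrum and forms the larger-over-smaller quotient. The function names, the synthetic spectrum, and the values chosen for the glare and shadow fractions (0.1 and 0.5) are illustrative assumptions, not values from this disclosure.

```python
import numpy as np


def glare_reading(i_t, i_0, alpha):
    """Equation 85: diffuse signal plus a fraction alpha of the incident light."""
    return (i_t + alpha * i_0) / i_0


def shadow_reading(i_t, i_0, beta):
    """Equation 86: only a fraction beta of the unshadowed diffuse signal."""
    return beta * i_t / i_0


def larger_over_smaller(r_a, r_b):
    """Divide the larger reflectance by the smaller at each wavelength."""
    return np.maximum(r_a, r_b) / np.minimum(r_a, r_b)


# Synthetic example (all numbers are made up for illustration).
wavelengths = np.linspace(370.0, 710.0, 35)
i_0 = np.ones_like(wavelengths)
i_t = 0.05 + 0.25 * (wavelengths - 370.0) / 340.0  # hypothetical tissue signal
clean = i_t / i_0                                   # unaffected reading R

glare_quotient = larger_over_smaller(glare_reading(i_t, i_0, 0.1), clean)
shadow_quotient = larger_over_smaller(clean, shadow_reading(i_t, i_0, 0.5))
```

Under these assumptions the shadow quotient is flat across the spectrum, while the glare quotient is largest where the tissue signal is smallest, which is the inverted shape the text describes.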

[0510] Figure 61 shows a graph 1332 depicting as a function of wavelength 1336 mean values 
and confidence intervals of a ratio 1334 of BB1 and BB2 broadband reflectance spectral values 
(larger value divided by smaller value) for regions confirmed as being either glare-obscured or 
shadow-obscured tissue, according to an illustrative embodiment of the invention. The shadow
points 1338 yield a nearly constant value, while the glare points 1340 vary over the range of 
wavelength 1336 in a manner that resembles the inverse of unobstructed tissue reflectance. 




Thus, Figure 61 illustrates an embodiment in which it is determined whether only one set of a
pair of sets of spectral data is affected by either glare or shadow, such that the other set is
unaffected by glare or shadow and may be used to determine a characteristic of the tissue, for
example. In an embodiment, the method comprises differentiating between glare and shadow by
observing the steep slope of glare-affected reflectance spectral measurements between about
577 nm and 599 nm, for example, compared to the nearly flat slope of shadow-affected
reflectance spectral measurements at those wavelengths, as seen in Figure 61.
[0511] In one embodiment, the arbitration in step 128 of Figure 1 includes applying and/or
developing spectral artifact classification rules (metrics) using spectral data, including one or
more sets of fluorescence and broadband reflectance data obtained using light at one or more
angles. In one embodiment, one set of fluorescence data and two sets of reflectance data are
obtained from a given region of a tissue sample (interrogation point), where each of the two sets
of reflectance data is obtained using light incident on the region at a different angle. These
metrics determine which data are representative of a given region of tissue. By varying the
metrics, desired levels of sensitivity and selectivity of a resulting tissue characterization using
tissue-representative data may be achieved.

[0512] The following metrics are applied in one embodiment of the arbitration in step 128 of
Figure 1 and were determined using the embodiments discussed above. These metrics were
developed using one set of fluorescence data and two sets of reflectance data, BB1 and BB2, for
samples of cervical tissue. Other embodiments use other combinations of spectral data sets.
Each of the two sets of reflectance data used in the following metrics was obtained using light
incident on a region of a sample at a different angle. An embodiment of the invention uses any or
all of the metrics listed below to determine if any set of data should be eliminated from use in
determining a characteristic of a region of tissue, due to the presence of a spectral artifact. In an
embodiment of the invention, wavelengths within a range of the wavelengths shown below are
used. In one embodiment, this range about the wavelengths is about ±10 nm. In an embodiment
of the invention, only certain parts of the metrics shown below are used. In one embodiment,
only a portion of a given set of spectral data is eliminated, not the entire set. In one
embodiment, BB1 and BB2 reflectance data are obtained, but fluorescence data are not. Here,
"eliminate data" means to eliminate data from consideration in an analysis, for example, an
analysis to determine a condition of a region. It is possible to change sensitivity and selectivity
of a tissue diagnostic algorithm by varying the metrics below, for instance by changing one or
more of the threshold constants. Such variations are within an embodiment of this invention.
The metrics for one exemplary embodiment are as follows:
Glare Metric #1: Eliminate BB1 data IF:
I. {BB1(419) > 0.25 AND BB1(699) > 0.51} OR
   BB1(529)/BB1(543) < 1.0;
OR II. Max{|ΔBB|/avgBB}(370-710) > 0.25 AND BB1(419) > 0.18
   AND BB1(699) > 0.51 AND
   {BB1(576)/BB2(576)}/{BB1(599)/BB2(599)} > 1.1;
OR III. Max{|ΔBB|/avgBB}(370-710) > 0.4 AND
   {BB1(576)/BB2(576)}/{BB1(599)/BB2(599)} > 1.1 AND BB2(699) > 0.3.

Glare Metric #2: Eliminate BB2 data IF:
I. {BB2(419) > 0.25 AND BB2(699) > 0.51} OR
   BB2(529)/BB2(543) < 1.0;
OR II. Max{|ΔBB|/avgBB}(370-710) > 0.25 AND BB2(419) > 0.18
   AND BB2(699) > 0.51 AND
   {BB2(576)/BB1(576)}/{BB2(599)/BB1(599)} > 1.1;
OR III. Max{|ΔBB|/avgBB}(370-710) > 0.4 AND
   {BB2(576)/BB1(576)}/{BB2(599)/BB1(599)} > 1.1 AND BB1(699) > 0.3.

Shadow Metric #1: Eliminate BB1 data IF:
I. BB2(499) > BB1(499) AND Max{|ΔBB|/avgBB}(370-710) > 0.25 AND BB1(499) < 0.05;
OR II. Max{|ΔBB|/avgBB}(370-710) > 0.5 AND
   {BB1(576)/BB2(576)}/{BB1(599)/BB2(599)} < 1.1 AND
   BB2(576) > BB1(576) AND BB1(419) < 0.2.

Shadow Metric #2: Eliminate BB2 data IF:
I. BB1(499) > BB2(499) AND Max{|ΔBB|/avgBB}(370-710) > 0.25 AND BB2(499) < 0.05;
OR II. Max{|ΔBB|/avgBB}(370-710) > 0.5 AND
   {BB2(576)/BB1(576)}/{BB2(599)/BB1(599)} < 1.1 AND
   BB1(576) > BB2(576) AND BB2(419) < 0.2.

Low Signal: Eliminate BB1, BB2, and Fl data IF:
I. Fl(479) < 3.5 counts/μJ (where mean fluorescent intensity of
   normal squamous tissue is about 70 counts/μJ at about 450 nm);
OR II. BB1(499) < 0.035 AND BB2(499) < 0.035.

where BB1(λ) is the BB1 reflectance spectrum measurement at wavelength λ, BB2(λ) is the
BB2 reflectance spectrum measurement at wavelength λ, Max{|ΔBB|/avgBB}(370-710)
indicates the maximum of {the absolute value of the difference between the BB1 and BB2
reflectance spectrum measurements divided by the average of the BB1 and BB2 measurements at
a given wavelength} over the range of about 370 to 710 nm, and Fl(λ) is the fluorescence
spectrum measurement at wavelength λ. The following are notes regarding the metrics listed
above and apply to a preferred embodiment, subject to the variations described above:
Glare Metric #1 and Glare Metric #2:

Level I: Broadband measurements are generally greater than about 0.25 at about 419 nm
and greater than about 0.51 at about 699 nm only when there is glare in the channel (i.e.,
BB1 or BB2). The lack of a downward slope between about 499 and about 543 nm is
also a strong indication that the broadband measurements are affected by glare.
Level II: Large percentage differences in the broadband measurements combined with
higher than average reflectance at about 419 nm and about 699 nm also indicate the
presence of glare. The presence of a slope when the broadband measurements at about
576 nm and about 599 nm are divided is further confirmation that glare is present.
Level III: A maximum broadband percent difference that is larger than about 0.4
indicates that there is a lighting artifact present. The presence of a slope when the
broadband measurements at about 576 and about 599 nm are divided and an off-channel
broadband greater than about 0.3 at about 699 nm reveal that the lighting artifact is due
to glare instead of shadow.

If a point is identified as glare in one channel, then subsequently identified as glare in
both channels, both broadband measurements should be eliminated.

Shadow Metric #1 and Shadow Metric #2:

Level I: Broadband measurements that are shadowed generally will have a large percent
difference between BB1 and BB2 and a low reflectance at about 499 nm.
Level II: A maximum broadband percent difference that is larger than about 0.5 indicates
that there is a lighting artifact present. Lacking a large slope when the broadband
measurements at about 576 and about 599 nm are divided and an off-channel broadband
less than about 0.2 at about 419 nm reveals that the point is shadow instead of glare.

Cases where both BB and Fl measurements should be eliminated:

Low Signal:

Broadband measurements lower than about 0.035 at about 449 nm or fluorescence
measurements lower than about 3.5 at about 479 nm indicate that the measurements are
not coming from tissue, but rather from blood, the os, smoke tube, speculum, or another
obstruction. Sites with significant shadowing in both broadband channels are also
identified with this metric. Because of the uncertainty of the tissue being measured, the
reflectance and fluorescence data from that point are assumed invalid, regardless of
whether it was identified by fluorescence or the broadband channels.

The low signal metric acts as a hard mask because it eliminates a qualifying interrogation
point from consideration by the classifier or the other masks, such as the spectral masks
in step 130 of Figure 1. The low signal metric acts as a hard mask, for example, for
points that have shadowing in both BB1 and BB2.
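
The glare, shadow, and low signal metrics listed above can be written compactly in code. The following Python sketch is one possible reading of those rules, not a definitive implementation: the function names are hypothetical, values are read at the exact listed wavelengths by linear interpolation rather than within a ±10 nm window, and no guard is included for zero-valued reflectance. Passing BB1 as `on` and BB2 as `off` gives Metric #1; swapping the arguments gives Metric #2.

```python
import numpy as np


def value_at(wl, spectrum, nm):
    """Spectrum value at wavelength nm (linear interpolation; an assumption)."""
    return float(np.interp(nm, wl, spectrum))


def max_percent_difference(wl, bb1, bb2, lo=370.0, hi=710.0):
    """Max{|dBB|/avgBB} over lo..hi nm, as defined in the text."""
    wl, bb1, bb2 = (np.asarray(a, dtype=float) for a in (wl, bb1, bb2))
    band = (wl >= lo) & (wl <= hi)
    diff = np.abs(bb1[band] - bb2[band])
    avg = 0.5 * (bb1[band] + bb2[band])
    return float(np.max(diff / avg))


def slope_ratio(wl, on, off):
    """{on(576)/off(576)} / {on(599)/off(599)}."""
    return (value_at(wl, on, 576) / value_at(wl, off, 576)) / (
        value_at(wl, on, 599) / value_at(wl, off, 599))


def is_glare(wl, on, off):
    """Glare metric for channel `on`, using `off` as the other channel."""
    pct = max_percent_difference(wl, on, off)
    slope = slope_ratio(wl, on, off)
    on419, on699 = value_at(wl, on, 419), value_at(wl, on, 699)
    level1 = (on419 > 0.25 and on699 > 0.51) or (
        value_at(wl, on, 529) / value_at(wl, on, 543) < 1.0)
    level2 = pct > 0.25 and on419 > 0.18 and on699 > 0.51 and slope > 1.1
    level3 = pct > 0.4 and slope > 1.1 and value_at(wl, off, 699) > 0.3
    return level1 or level2 or level3


def is_shadow(wl, on, off):
    """Shadow metric for channel `on`, using `off` as the other channel."""
    pct = max_percent_difference(wl, on, off)
    slope = slope_ratio(wl, on, off)
    on499, off499 = value_at(wl, on, 499), value_at(wl, off, 499)
    level1 = off499 > on499 and pct > 0.25 and on499 < 0.05
    level2 = (pct > 0.5 and slope < 1.1
              and value_at(wl, off, 576) > value_at(wl, on, 576)
              and value_at(wl, on, 419) < 0.2)
    return level1 or level2


def is_low_signal(wl, bb1, bb2, fl_wl=None, fl=None):
    """Low signal metric; the fluorescence test is skipped if fl is None."""
    low_fl = fl is not None and value_at(fl_wl, fl, 479) < 3.5
    low_bb = value_at(wl, bb1, 499) < 0.035 and value_at(wl, bb2, 499) < 0.035
    return low_fl or low_bb
```

Keeping the thresholds as literals mirrors the listing above; in practice they would likely be collected into named constants so that sensitivity and selectivity can be tuned, as the text contemplates.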






[0513] The metrics used in this embodiment of step 128 of Figure 1 include a low signal
metric, which detects spectral data affected by obstruction artifacts such as blood, a speculum, a
smoke tube, or other obstruction. This metric also identifies regions where both sets of
broadband reflectance data are affected by shadow. These were combined into one low signal
metric in this embodiment, since regions affected by these artifacts exhibit similar
characteristics, such as low fluorescence and low broadband reflectance measurements.
[0514] Figure 62 shows a graph 1342 depicting broadband reflectance 1344 as a function of
wavelength 1346 for the BB1 channel 1348 and the BB2 channel 1350 measurements for a
region of tissue where the BB1 data is affected by glare but the BB2 data is not, according to an
illustrative embodiment of the invention. The glare leads to a higher value of reflectance 1344
than that of surrounding unaffected tissue. By applying the metrics listed above in step 128 of
Figure 1, it is determined that the exemplary BB1 set of spectral data shown in Figure 62 is
affected by glare and is thus not suitably representative of this region of the tissue sample.
Applying the metrics of step 128 also determines that the BB2 set of spectral data is potentially
representative of this region of the sample (unaffected by an artifact), since it is not eliminated.
One embodiment comprises using this representative data in step 132 of Figure 1 to determine a
condition of this region of the sample, for example, the state of health.

[0515] Figure 63 shows a graph 1351 depicting broadband reflectance 1344 as a function of
wavelength 1346 for the BB1 channel 1352 and the BB2 channel 1354 broadband reflectance
spectral data for a region of tissue where the BB2 data is affected by shadow but the BB1 data is
not, according to an illustrative embodiment of the invention. The shadow leads to a lower value
of reflectance 1344 than that of surrounding unaffected tissue. By applying the metrics listed
above in step 128 of Figure 1, it is determined that the exemplary BB2 set of spectral data shown
in Figure 63 is affected by shadow and is therefore not suitably representative of this region of
the tissue sample. Applying the metrics of step 128 also leads to the determination that the BB1
set of spectral data is potentially representative of this region of the sample, since the BB1 set of
data is not eliminated. One embodiment comprises using this representative data in step 132 of
Figure 1 to determine a condition of this region of the sample, for example, the state of health.
[0516] Figure 64 shows a graph 1358 depicting broadband reflectance 1360 as a function of
wavelength 1362 for the BB1 channel 1364 and the BB2 channel 1366 measurements for a
region of tissue that is obscured by blood, according to an illustrative embodiment of the
invention. By applying the metrics listed above, it is determined that blood is present, and that
both the BB1 and the BB2 sets of spectral data are considered unrepresentative of this region of
the tissue sample.

[0517] Figure 65 shows a graph 1367 depicting broadband reflectance 1360 as a function of
wavelength 1362 for the BB1 channel 1368 and the BB2 channel 1370 measurements for a
region of tissue that is unobscured, according to an illustrative embodiment of the invention.
Applying this method determines that neither set of spectral data is affected by an artifact, and,
therefore, either is representative of the tissue sample. One embodiment comprises using an
average value 1372 of the BB1 and BB2 measurements at each wavelength to represent the
region of the tissue sample in determining a condition of this region, for example, the state of
health of the region, in step 132 of Figure 1.
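
The outcomes illustrated in Figures 62 to 65 amount to a small selection rule: keep the unaffected channel when one is eliminated, average the two when neither is eliminated, and return nothing when both are eliminated. A minimal Python sketch of that rule follows; the function name and the use of `None` for "no representative spectrum" are assumptions made for illustration.

```python
import numpy as np


def arbitrate_broadband(bb1, bb2, drop_bb1, drop_bb2):
    """Return one representative reflectance spectrum, or None.

    drop_bb1 / drop_bb2 are the elimination decisions produced by the
    arbitration metrics for the two broadband channels.
    """
    bb1 = np.asarray(bb1, dtype=float)
    bb2 = np.asarray(bb2, dtype=float)
    if drop_bb1 and drop_bb2:
        return None                 # e.g. blood-obscured point (Figure 64)
    if drop_bb1:
        return bb2                  # e.g. glare in BB1 only (Figure 62)
    if drop_bb2:
        return bb1                  # e.g. shadow in BB2 only (Figure 63)
    return 0.5 * (bb1 + bb2)        # unobscured point (Figure 65)
```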

[0518] Application of the metrics listed above was performed using various tissue types to
verify the sensitivity and specificity of the metrics. While, in one embodiment, it is undesirable
to eliminate good spectral data of normal tissue, it is worse to eliminate good spectral data of
diseased tissue, particularly if it is desired to use the data in the classification of the state of
health of a region of tissue. The following tissue types were used in the verification: tt-132
(metaplasia by impression), tt-155 (normal by impression), tt-117 (blood), NEDpath (no
evidence of disease confirmed by pathology), and cin23all (CIN 2/3 diseased tissue). Table 5
shows the number of points (regions) corresponding to each of these tissue types, the
determinations from the metrics listed above for these points, and the percentage of points where
one set of broadband reflectance spectral data was eliminated, where both sets of broadband
reflectance spectral data were eliminated, and where both reflectance and fluorescence spectral
data were eliminated.



Table 5: Verification of Metrics

Tissue Type             cin23all   nedpath    tt-117   tt-132a    tt-155
Total pts.                   477       919       175      5000      2016
Low Signal                     2        14       126         2         0
Glare in BB1                   7        30         4       122        26
Glare in BB2                   9        40         9       134        16
Glare in both                  3         5         1        15         5
Shadow in BB1                 47        35         4       165       132
Shadow in BB2                 16        37        24       359        32
One BB Removed (%)          16.6      15.5      23.4      15.6      10.2
Both BB Removed (%)         1.05      2.07     72.57      0.34      0.25
Fl Removed (%)              0.42      1.52     72.00      0.04      0.00


[0519] For the regions (points) corresponding to CIN 2/3 diseased tissue, no broadband
reflectance measurements were unnecessarily eliminated from the set using the above metrics.
The points identified as being low signal were all located on the os. All points that were
identified by the metric as shadow were verified as being correct, and only one point identified
as glare was incorrect.

[0520] For the nedpath points (no evidence of disease), only two tissue points were
unnecessarily eliminated after being misidentified as mucus. A point that was actually dark red
tissue with glare was incorrectly identified as shadow in BB2. The points that were identified as
glare were verified as being correct.

[0521] Out of the 175 blood points, 126 were identified as being low signal. The glare points 
and shadow points were accurate. 
[0522] Out of the 5000 points in the metaplasia by impression group, there were no valid tissue
points lost. The data set was improved by eliminating about 800 readings of points affected by
either glare or shadow.

[0523] Out of the 2016 normal by impression points, no measurements were unnecessarily 
removed from the set. 

[0524] Figure 66 shows a graph 1374 depicting the reduction in the variability of broadband
reflectance measurements 1376 of CIN 2/3-confirmed tissue produced by filtering (eliminating
non-representative spectral data) using the metrics of step 128 in Figure 1 described above,
according to an illustrative embodiment of the invention. The graph 1374 depicts mean values
and standard deviations of broadband reflectance spectral data before and after filtering.

[0525] Figure 67 shows a graph 1378 depicting the reduction in the variability of broadband
reflectance measurements 1376 of tissue classified as "no evidence of disease confirmed by 
pathology" produced by filtering using the metrics described above, according to an illustrative 
embodiment of the invention. The graph 1378 depicts mean values and standard deviations of 
broadband reflectance spectral data before and after filtering. 

[0526] Figure 68 shows a graph 1380 depicting the reduction in the variability of broadband
reflectance measurements 1376 of tissue classified as "metaplasia by impression" produced by 
filtering using the metrics described above, according to an illustrative embodiment of the 
invention. The graph 1380 depicts mean values and standard deviations of broadband 
reflectance spectral data before and after filtering. 

[0527] Figure 69 shows a graph 1382 depicting the reduction in the variability of broadband
reflectance measurements 1376 of tissue classified as "normal by impression" produced by 
filtering using the metrics described above, according to an illustrative embodiment of the 
invention. The graph 1382 depicts mean values and standard deviations of broadband 
reflectance spectral data before and after filtering. 

[0528] Figure 70A depicts an exemplary image of cervical tissue 1388 divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to one embodiment of the invention. Figure 70B is a representation 1398 of
the regions depicted in Figure 70A and shows the categorization of each region using the metrics




in step 128 of Figure 1. The black-highlighted sections 1390 of the image 1388 in Figure 70A
correspond to points (regions) that had both reflectance measurements eliminated by application
of the embodiment method. Many of the lower points 1392, as seen in both Figures 70A and
70B, are in shadow because the speculum obstructs the view of one of the channels. Glare is
correctly identified prominently at the upper one o'clock position 1394. Since there are blood
points on the shadowed section, some are labeled blood (low signal) and others are treated as
shadow.

[0529] Figure 71A depicts an exemplary image of cervical tissue 1402 divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to one embodiment of the invention. Figure 71B is a representation 1406 of
the regions depicted in Figure 71A and shows the categorization of each region using the metrics
in step 128 of Figure 1. Figures 71A and 71B show an example of a cervix that has a large
portion of the lower half 1404 affected by shadow. However, only one of the sets of reflectance
spectral data (BB2) is affected by the shadow artifact. The BB1 reflectance spectral data is not
affected by shadow. Applying the metrics above, the BB1 data are used to describe these
regions, while the BB2 data are eliminated from consideration. The accuracy of tissue
characterization using the reflectance measurements should be improved significantly for this
patient using the arbitration metrics of step 128 of Figure 1, since the more accurate broadband
measurements will be used in later characterization steps instead of simply averaging the two
broadband measurements, which would skew the measurements due to a lighting artifact.

[0530] Figure 72A depicts an exemplary image of cervical tissue 1410 divided into regions for
which two types of reflectance spectral data and one type of fluorescence spectral data are
obtained, according to an illustrative embodiment of the invention. Figure 72B is a
representation 1416 of the regions depicted in Figure 72A and shows the categorization of each
region using the metrics in step 128 of Figure 1. Figures 72A and 72B show an image with a
portion 1412 that is shadowed and off of the cervix. Due to an obstruction from the smoke tube
in the upper part of the image, there are many low signals. Even though much of the cervix is
shadowed in BB1 1414, there are still some BB2 and fluorescence readings usable in later tissue
classification steps.

Classification system overview

[0531] The tissue characterization system 100 of Figure 1 combines spectral data and image
data obtained by the instrument 102 to characterize states of health of regions of a tissue sample.
In one embodiment, the spectral data are first motion-tracked 106, preprocessed 114, and
arbitrated 128 before being combined with image data in step 132 of Figure 1. Likewise, in one
embodiment, the image data are first focused 122 and calibrated 124 before being combined with
spectral data in step 132 of Figure 1. Each of these steps is discussed in more detail herein.
[0532] Figure 73 shows how spectral data and image data are combined in the tissue
characterization system of Figure 1, according to one embodiment. The block diagram 1420 of
Figure 73 depicts steps in processing and combining motion-tracked 106, preprocessed 114, and
arbitrated 128 spectral data with focused 122, calibrated 124 image data to determine states of
health of regions of a tissue sample. After preprocessing 114, spectral data from each of the
interrogation points (regions) of the tissue sample are arbitrated in step 128 of Figure 73. In the
embodiment shown, a fluorescence spectrum, F, and two broadband reflectance spectra, BB1
and BB2, are used to determine one representative reflectance spectrum, BB, used along with the
fluorescence spectrum, F, for each interrogation point. This is depicted in Figure 73 as three
heavy arrows representing the three spectra - BB1, BB2, and F - entering arbitration block 128
and emerging as two spectra - BB and F. Block 128 of Figure 73 also applies an initial
low-signal mask as a first pass at identifying obscured interrogation points, discussed previously
herein.

[0533] In the embodiment of Figure 73, the arbitrated broadband reflectance spectrum, BB, is
used in the statistical classification algorithm 134, while both the broadband reflectance
spectrum, BB, and the fluorescence spectrum, F, as well as the image data, are used to determine
heuristic-based and/or statistics-based metrics, or "masks", for classifying the state of health of
tissue at interrogation points. Masking can be a means of identifying data that are potentially
non-representative of the tissue sample. Potentially non-representative data include data that
may be affected by an artifact or obstruction such as blood, mucus, fluid, glare, or a speculum.
Such data are either hard-masked or soft-masked. Hard-masking of data includes identifying
interrogation points at which the data are not representative of unobscured, classifiable tissue.
This results in a characterization of "Indeterminate" at such an interrogation point, and no further
computations are necessary for that point. Soft-masking includes applying a weighting function
or weighting factor to identified, potentially non-representative data. The weighting is taken into
account during calculation of disease probability and may or may not result in an indeterminate
diagnosis at the corresponding tissue region. Soft-masking provides a means of weighting
spectral and/or image data according to the likelihood that the data are representative of clear,
unobstructed tissue in a region of interest. In the embodiment shown in Figure 73, both hard
masks and soft masks are determined using a combination of spectral data and image data.
Furthermore, the masks of Figure 73 use spectral and image data to identify interrogation points
that are not particularly of interest in the exam, such as the vaginal wall, smoke tube tissue, the
os, or tissue outside the region of interest.

[0534] In addition to determining data that are potentially non-representative of regions of
interest, the masks shown in Figure 73 also include masks that determine where the data are
highly indicative of necrotic tissue or disease-free (NED) tissue. It has been discovered that
necrotic tissue and disease-free tissue are often more predictably determined by using a heuristic
metric instead of or in combination with a statistical classifier than by using a statistical classifier
alone. For example, one embodiment uses certain values from fluorescence spectra to determine
necrotic regions, since fluorescence spectra can indicate the FAD/NADH component and
porphyrin component of necrotic tissue. Also, an embodiment uses prominent features of
fluorescence spectra indicative of normal squamous tissues to classify tissue as "NED" (no
evidence of disease) in the spectral mask.

[0535] Identifying necrotic and NED regions at least partially by using heuristic metrics allows
for the development of statistical classifiers 134 that concentrate on differentiating tissue less
conducive to heuristic classification - for example, statistical classifiers that differentiate high
grade cervical intraepithelial neoplasia (i.e., CIN 2/3) from low grade neoplasia (i.e., CIN 1) and
healthy tissue.

[0536] In Figure 73, step 130 uses the arbitrated spectra, BB and F, to determine four spectral
masks - NED_spec (no evidence of disease), Necrosis_spec, [CE]_spec (cervical edge/vaginal wall),
and [MU]_spec (mucus/fluid). The focused, calibrated video data is used to determine nine image
masks - Glare_vid, Mucus_vid, Blood_vid, Os_vid, [ROI]_vid (region of interest), [ST]_vid (smoke tube),
[SP]_vid (speculum), [VW]_vid (vaginal wall), and [FL]_vid (fluid and foam). Step 1422 of Figure 73
combines these masks to produce a hard "indeterminate" mask, a soft "indeterminate" mask, a
mask identifying necrotic regions, and a mask identifying healthy (NED) regions. In the
embodiment of Figure 73, steps 1424 and 1426 apply the necrotic mask and hard
"indeterminate" mask, respectively, prior to using the broadband spectral data in the statistical
classifiers 134, while steps 1428 and 1430 apply the soft "indeterminate" mask and the NED
mask after the statistical classification step 134.
[0537] The embodiment shown in Figure 73 can classify each interrogation point in step 1432
as necrotic, CIN 2/3, NED, or Indeterminate. There may be some post-classification processing
in step 1434, for example, for interrogation points having a valid fluorescence signal but having
both broadband signals, BB1 and BB2, eliminated by application of the arbitration metrics in
step 128. The embodiment in Figure 73 then uses the final result to create a disease display
overlay of a reference image of the tissue sample in step 138. Each of the masking and
classification steps summarized above is discussed in more detail herein.
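
The ordering of mask application in Figure 73 can be illustrated with a short Python sketch. Only the order of the steps (necrotic mask, hard "indeterminate" mask, statistical classifier, soft "indeterminate" mask, NED mask) is taken from the text. Everything else is a placeholder assumption: the `PointMasks` structure, the encoding of the soft mask as a weight between 0 and 1, the rule that multiplies the classifier probability by that weight, and both threshold values.

```python
from dataclasses import dataclass


@dataclass
class PointMasks:
    """Combined masks for one interrogation point (hypothetical encoding)."""
    necrotic: bool = False
    hard_indeterminate: bool = False
    soft_weight: float = 1.0   # 1.0 = data fully trusted (assumed convention)
    ned: bool = False


def classify_point(masks, cin23_probability, threshold=0.5, min_weight=0.2):
    """Apply masks in the order of Figure 73 around a statistical classifier.

    cin23_probability is a zero-argument callable standing in for the
    statistical classifiers 134; it is only called for points that survive
    the necrotic and hard masks, since no further computation is needed
    for masked points.
    """
    if masks.necrotic:                       # step 1424
        return "Necrotic"
    if masks.hard_indeterminate:             # step 1426
        return "Indeterminate"
    probability = cin23_probability()        # step 134
    if masks.soft_weight < min_weight:       # step 1428 (placeholder rule)
        return "Indeterminate"
    if masks.ned:                            # step 1430
        return "NED"
    weighted = probability * masks.soft_weight  # placeholder weighting
    return "CIN 2/3" if weighted >= threshold else "NED"
```

Passing the classifier as a callable is a design choice made here to show that hard-masked points never reach the statistical step; the actual weighting used for soft masks is described elsewhere in the disclosure and may differ from this placeholder.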
[0538] In one alternative embodiment, the statistical classifiers in step 134 of Figure 73
additionally include the use of fluorescence, image, and/or kinetic data. One alternative
embodiment includes using different sets of spectral and/or image masks than those in Figure 73.
Also, one alternative embodiment includes using a different order of application of heuristic
masks in relation to one or more statistical classifiers. In one alternative embodiment, kinetic
data is determined by obtaining intensity data from a plurality of images captured during a tissue
scan, determining a relationship between corresponding areas of the images to reflect how they
change with time, and segmenting the images based on the relationship. For example, an
average kinetic whitening curve may be derived for tissue areas exhibiting similar whitening
behavior. Whitening kinetics representative of a given area may be compared to reference
whitening kinetics indicative of known states of health, thereby indicating a state of health of the
given area. In one alternative embodiment, the kinetic image-based data may be combined with
spectral data to determine states of health of regions of a tissue sample.
[0539] Figure 74 shows a block diagram 1438 depicting steps in the method of Figure 73 in
further detail. The steps of Figure 74 are summarized below and are discussed in detail
elsewhere herein. Steps 1440, 1442, 1444, and 1446 in Figure 74 depict determination of the
spectral masks from the arbitrated broadband reflectance and fluorescence signals, as seen in
step 130 of Figure 73. Steps 1448, 1450, 1452, 1454, 1456, 1458, 1460, 1462, and 1464 in
Figure 74 depict determination of the image masks from the focused, calibrated video data, as
seen in step 108 of Figure 73. The lines extending below these mask determination steps in
Figure 74 show how (in one embodiment) the masks are combined together, as indicated in step
1422 of Figure 73. Steps 1466, 1468, 1470, 1472, 1474, 1476, 1478, and 1480 of Figure 74
show which masks are combined. Also important is the manner in which the masks are
combined, disclosed in the detailed step explanations herein.

[0540] The statistical classification step 134 from Figure 73 is shown in Figure 74 as steps
1482, 1484, and 1486. Here, the pictured embodiment applies a necrosis mask 1424 and a hard
"indeterminate" mask 1426 to the arbitrated broadband spectral data to eliminate the need to
further process certain necrotic and indeterminate interrogation points in the classification step.
Classification includes processing of broadband spectral data via wavelength region truncation,
wavelength subsampling, and/or mean-centering. The processed data is then used in two
different feature extraction methods. These include a principal component analysis (PCA)
method used in the DASCO classifier step 1484 (Discriminant Analysis with Shrunken
Covariances) and a feature coordinate extraction (FCE) method used in the DAFE classifier step
1482 (Discriminant Analysis Feature Extraction). Each of steps 1484 and 1482 extracts a lower
dimensional set of features from the spectral data that is then used in a Bayes' classifier to
determine probabilities of classification in one or more tissue-class/state-of-health categories.
The classification probabilities determined in steps 1482 and 1484 are combined in step 1486.
Each of the classifiers in steps 1482 and 1484 is specified by a set of parameters that have been
determined by training on known reference data. One embodiment includes updating the
classifier parameters as additional reference data becomes available.

Spectral masking 

[0541] The invention comprises determining spectral masks. Spectral masks identify data
from a patient scan that are potentially non-representative of regions of interest of the tissue 
sample. Spectral masks also identify data that are highly indicative of necrotic tissue or normal 
squamous (NED) tissue. In one embodiment, the spectral masks are combined as indicated in 
the block flow diagram 1438 of Figure 74, in order to account for the identification of spectrally- 
masked interrogation points in the tissue-class/state-of-health classification step 1432. Steps 
1440, 1442, 1444, and 1446 in Figure 74 depict the determination of spectral masks from the 
arbitrated broadband reflectance and fluorescence spectra obtained during a patient scan and are 
discussed in more detail below. 

[0542] Step 1440 in Figure 74 depicts the determination of an NEDspec (no evidence of
disease) spectral mask using data from the fluorescence spectrum, F, and the broadband
reflectance spectrum, BB, at each of the interrogation points of the scan pattern, following the
arbitration and low-signal masking step 128. Applying the NEDspec mask reduces false positive
diagnoses of CIN 2/3 resulting from the tissue-class/state-of-health classification step 134 in
Figure 1 (and Figure 89). The NEDspec mask identifies tissue having optical properties distinctly
different from those of CIN 2/3 tissue. More specifically, in one embodiment, the NEDspec mask
uses differences between the fluorescence signals seen in normal squamous tissue and CIN 2/3
tissue. These differences are not accounted for by tissue-class/state-of-health classifiers based on
broadband reflectance data alone. For example, the NEDspec mask uses the collagen peak seen in
the fluorescence spectra of normal squamous tissue at about 410 nm to distinguish normal
squamous tissue from CIN 2/3 tissue.

[0543] Figure 75 shows a scatter plot 1500 depicting discrimination between regions of normal
squamous tissue and CIN 2/3 tissue for a set of known reference data, according to one
embodiment. Plotting fluorescence intensity at 460 nm (y-axis, 1502) against a ratio of
fluorescence intensity, F(505 nm)/F(410 nm), (x-axis, 1504) provides good discrimination
between regions known to be normal squamous tissue (blue points in Figure 75) and regions
known to be CIN 2/3 tissue (red points in Figure 75). One component of the NEDspec
discrimination metric is shown by line 1506 in Figure 75, which divides a region of the plot that
is predominately representative of normal squamous tissue (1508) from a region of the plot that
is predominately representative of CIN 2/3 tissue (1510). The divider 1506 can be adjusted, for
example, to further reduce false positives or to allow detection of more true positives at the
expense of increased false positives.

[0544] In one embodiment, the fluorescence over reflectance ratio at about 430 nm is also
included in the NEDspec metric to determine normal columnar tissue sites that may not be
identified by the component of the metric illustrated in Figure 75 (i.e. blue points on the right of
line 1506). It is found that fluorescence of CIN 2/3 tissue at about 430 nm is lower relative to
normal tissue, while CIN 2/3 reflectance at about 430 nm is higher relative to normal tissue, after
application of a contrast agent such as acetic acid.

[0545] Figure 76 shows a graph 1512 depicting as a function of wavelength 1514 the mean
broadband reflectance values 1516 for a set of known normal squamous tissue regions 1518 and
a set of known CIN 2/3 tissue regions 1520, used in one embodiment to determine an additional
component of the NEDspec spectral mask. Figure 77 shows a graph 1522 depicting as a function
of wavelength 1524 the mean fluorescence intensity values 1526 for the set of known squamous
tissue regions 1528 and the set of known CIN 2/3 tissue regions 1530. The difference between
curves 1528 and 1530 in Figure 77 is pronounced. Thus, a term is included in the NEDspec
metric based on the best ratio of wavelengths found to maximize values of D in the
discrimination equation, Equation 87, below:

where μ indicates mean and σ indicates standard deviation. Figure 78 shows a graph 1532
depicting values of D in Equation 87 using a range of numerator wavelengths 1536 and
denominator wavelengths 1538. According to the graph 1532 in Figure 78, values of D are
maximized using the fluorescence ratio F(450 nm)/F(566 nm). Alternately, other combinations
of numerator wavelength and denominator wavelength may be chosen.

[0546] A scatter plot depicting discrimination between regions of normal squamous tissue and
CIN 2/3 tissue for a set of known reference data is produced by comparing the ratio F(450
nm)/F(566 nm) to a threshold constant. Then, a graph of true positive ratio (TPR) versus false
positive ratio (FPR) in the discrimination between regions of normal squamous tissue and CIN
2/3 tissue is obtained using a threshold constant. For example, a TPR of 65% and an FPR of
0.9% are obtained using a threshold constant of 4.51. The ratio of false positives may be reduced
by adjusting the threshold.
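
The wavelength-ratio search and the threshold evaluation described above can be sketched in Python as follows. This is only an illustrative sketch, not the disclosed implementation. It assumes that D has the form given for Equation 91 (absolute difference of the class means divided by the square root of the summed class variances). The function names `discrimination`, `best_ratio`, and `tpr_fpr`, the array layout, and the reading of TPR and FPR as the fractions of normal and CIN 2/3 points whose ratio exceeds the threshold are hypothetical choices made for the example.

```python
import numpy as np

def discrimination(a, b):
    """Assumed form of D: |mean(a) - mean(b)| / sqrt(var(a) + var(b))."""
    return abs(a.mean() - b.mean()) / np.sqrt(a.var() + b.var())

def best_ratio(spectra_a, spectra_b, wavelengths):
    """Try every (numerator, denominator) wavelength pair and keep the largest D.

    spectra_a, spectra_b: arrays of shape (n_points, n_wavelengths), strictly positive.
    Returns (D, numerator_wavelength, denominator_wavelength).
    """
    best = (-np.inf, None, None)
    for i, num in enumerate(wavelengths):
        for j, den in enumerate(wavelengths):
            if i == j:
                continue
            d = discrimination(spectra_a[:, i] / spectra_a[:, j],
                               spectra_b[:, i] / spectra_b[:, j])
            if d > best[0]:
                best = (d, num, den)
    return best

def tpr_fpr(ratio_normal, ratio_cin, threshold):
    """Treat 'ratio > threshold' as masking a point as normal (assumed convention).

    TPR: fraction of known normal points that are masked.
    FPR: fraction of known CIN 2/3 points that are masked.
    """
    tpr = float(np.mean(ratio_normal > threshold))
    fpr = float(np.mean(ratio_cin > threshold))
    return tpr, fpr
```

Sweeping `threshold` over a range of values and plotting the resulting pairs gives a TPR versus FPR curve of the kind referred to in the text.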

[0547] Therefore, in one embodiment, the NEDspec mask combines the following three metrics:

F(430)/BB(430) > x1 (88)

F(450)/F(566) > x2 (89)

F(460) > x3 · F(505)/F(410) - x4 (90)
where x1, x2, x3, and x4 are constants chosen based on the desired aggressiveness of the metric.
Equations 88-90 account for the distinguishing features of spectra obtained from regions of
normal squamous tissue versus spectra from CIN 2/3 tissue regions, as discussed above.

[0548] Figures 79A-D illustrate adjustment of the components of the NEDspec mask metric
shown in Equations 88, 89, and 90. Figure 79A depicts a reference image of cervical tissue 1554
from a patient scan in which spectral data is used in arbitration step 128, in NEDspec spectral
masking, and in statistical classification of interrogation points of the tissue sample. Figure 79B
is a representation (obgram) 1556 of the interrogation points (regions) of the tissue sample
depicted in the reference image 1554 of Figure 79A and shows points that are "masked"
following application of Equation 90. The obgram 1556 of Figure 79B shows that some
additional interrogation points are masked as NED tissue by adjusting values of x3 and x4 in
Equation 90 from {x3 = 120, x4 = 42} to {x3 = 115, x4 = 40}. Figure 79C shows interrogation
points that are "masked" following application of Equation 89. The obgram 1570 of Figure 79C
shows that a few additional points are masked as NED tissue by adjusting the value of x2 from
4.0 to 4.1. Figure 79D shows interrogation points that are masked following application of
Equation 88. The obgram 1584 of Figure 79D shows that a few additional points are masked as
NED tissue by adjusting the value of x1 from 610 to 600.

[0549] In one embodiment, values of x1, x2, x3, and x4 in Equations 88, 89, and 90 are
determined using multidimensional unconstrained nonlinear minimization. In one embodiment,
the overall NEDspec metric that results is as follows:

F(430)/BB(430) > 600 ct/μJ OR
F(450)/F(566) > 4.1 OR
F(460) > 115 · F(505)/F(410) - 40
where the mean fluorescent intensity of normal squamous tissue is about 70 counts/μJ at about
450 nm.
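
The three-part NEDspec test above could be applied to one interrogation point roughly as shown below. This is a sketch under stated assumptions: the function name `ned_spec_mask` and the use of dictionaries keyed by wavelength in nm are hypothetical, and the default constants are the values quoted in the metric above.

```python
def ned_spec_mask(F, BB, x1=600.0, x2=4.1, x3=115.0, x4=40.0):
    """Return True if a point is masked as NED by any one of the three components.

    F:  dict mapping wavelength (nm) to fluorescence intensity (ct/uJ)
    BB: dict mapping wavelength (nm) to broadband reflectance
    """
    fluor_over_refl = F[430] / BB[430] > x1             # Equation 88
    fluor_ratio = F[450] / F[566] > x2                  # Equation 89
    collagen_line = F[460] > x3 * F[505] / F[410] - x4  # Equation 90 (line 1506)
    return fluor_over_refl or fluor_ratio or collagen_line
```

Because the components are joined by OR, loosening any single constant can only add masked points, which matches the per-equation adjustments illustrated in Figures 79B-D.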

[0550] Step 1442 in Figure 74 depicts the determination of Necrosisspec, a necrotic tissue
spectral mask, using data from the fluorescence spectrum, F, at each of the interrogation points
of the scan pattern, following the arbitration and low-signal masking step 128. Unlike the other
spectral masks (steps 1440, 1444, and 1446 in Figure 74), which are designed to reduce false
positive diagnoses of CIN 2/3, the Necrosisspec mask identifies areas of necrotic tissue, thereby
identifying patients with fairly advanced stages of invasive carcinoma.
[0551] In one embodiment, the Necrosisspec mask uses prominent features of the fluorescence
spectra from a set of known necrotic regions to identify necrotic tissue. For example, in one
embodiment, the Necrosisspec mask uses the large porphyrin peaks of necrotic tissue at about 635
nm and/or at about 695 nm in identifying necrotic tissue. Figure 80 shows a graph 1598
depicting fluorescence intensity 1600 as a function of wavelength 1602 from an interrogation
point confirmed as invasive carcinoma by pathology and necrotic tissue by impression, while
Figure 81 shows a graph 1612 depicting broadband reflectance spectra BB1 and BB2 for the
same point.

[0552] The graph 1598 of Figure 80 shows the distinctive porphyrin peaks at reference 
numbers 1604 and 1606. Concurrent with high porphyrin fluorescence at necrotic regions is a 
smaller peak at about 510 nm (label 1608), possibly due to flavin adenine dinucleotide (FAD), 
with an intensity greater than or equal to that of nicotinamide adenine dinucleotide (NADH) at 
about 450 nm (label 1610). The FAD/NADH ratio is a measure of ischemia and/or hypoxia 
indicative of advanced stages of cancer. 

[0553] Thus, in one embodiment, the overall Necrosisspec metric has one or more components
indicative of FAD/NADH and one or more components indicative of porphyrin. In one
embodiment, the Necrosisspec metric is as follows:

F(510 nm)/F(450 nm) > 1.0 AND
F(635 nm)/F(605 nm) > 1.3 AND
F(635 nm)/F(660 nm) > 1.3 AND
F(635 nm) > 20 ct/μJ

where mean fluorescent intensity of normal squamous tissue is about 70 counts/μJ at about
450 nm, and where the first line of the metric indicates FAD/NADH (FAD) and the remainder of
the metric indicates porphyrin. This metric requires all components to be satisfied in order for a
region of tissue to be classified as necrotic. In one embodiment, the combination is needed to
reduce false necrosis diagnoses in patients. The presence of porphyrin does not always indicate
necrosis, and necrosis masking based solely on the detection of porphyrin may produce an
unacceptable number of false positives. For example, porphyrin may be present due to
hemoglobin breakdown products following menses or due to systemic porphyrin resulting from
medications, bacterial infection, or porphyria. Thus, both porphyrin and the indication of FAD
must be detected in order for a region to be identified as necrotic by
the Necrosisspec metric in the embodiment described above.
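
A minimal sketch of the Necrosisspec test is given below, keeping the FAD and porphyrin components separate so that each can be displayed on its own as well as combined. The function name `necrosis_spec_mask`, the dictionary input, and the tuple return value are hypothetical; only the thresholds come from the metric above.

```python
def necrosis_spec_mask(F):
    """Evaluate the necrosis metric for one interrogation point.

    F: dict mapping wavelength (nm) to fluorescence intensity (ct/uJ).
    Returns (is_necrotic, fad_component, porphyrin_component).
    """
    # First line of the metric: FAD/NADH indicator.
    fad = F[510] / F[450] > 1.0
    # Remaining lines: porphyrin peak at about 635 nm relative to its shoulders,
    # plus a minimum absolute intensity.
    porphyrin = (F[635] / F[605] > 1.3
                 and F[635] / F[660] > 1.3
                 and F[635] > 20.0)
    # All components must hold for the point to be masked as necrotic.
    return fad and porphyrin, fad, porphyrin
```

A point with strong porphyrin peaks but a low F(510 nm)/F(450 nm) ratio returns `(False, False, True)`, which is the porphyrin-only case the text describes as a potential false positive.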

[0554] Figure 82A depicts a reference image 1618 of cervical tissue from the scan of a patient
confirmed as having advanced invasive cancer, in which spectral data is used in arbitration step
128, in Necrosisspec spectral masking, and in statistical classification 134 of interrogation points
of the tissue sample. Figure 82B is an obgram 1620 of the interrogation points (regions) of the
tissue sample depicted in Figure 82A and shows points that are identified by application of the
FAD component of the Necrosisspec metric above (1628), as well as points that are identified by
application of the porphyrin component of the Necrosisspec metric above (1626). The overall
Necrosisspec mask above identifies points as necrotic only when both FAD and porphyrin are
identified. In Figure 82B, interrogation points that are marked by both a blue dot (FAD 1628)
and a green ring (porphyrin 1626) are identified as necrotic tissue by application of the
Necrosisspec metric above.

[0555] Step 1444 in Figure 74 depicts the determination of a cervical edge/vaginal wall
spectral mask ([CE]spec) using data from the fluorescence spectrum, F, and the broadband
reflectance spectrum, BB, of each interrogation point of a scan, following the arbitration and
low-signal masking step 128. The [CE]spec mask identifies low-signal outliers corresponding to
the cervical edge, os, and vaginal wall, which, in one embodiment, are regions outside an area of
diagnostic interest for purposes of the tissue characterization system 100 of Figure 1.

[0556] Figures 83, 84, 85, and 86 compare broadband reflectance and fluorescence spectra of
cervical edge and vaginal wall regions to spectra of CIN 2/3 tissue. In one embodiment, these
comparisons are used in a discrimination analysis to determine a [CE]spec spectral mask. Figure
83 shows a graph 1638 depicting as a function of wavelength 1640 the mean broadband
reflectance values 1642 for a set of known cervical edge regions 1644 and a set of known CIN
2/3 tissue regions 1646. Figure 84 shows a graph 1648 depicting as a function of wavelength
1650 the mean fluorescence intensity values 1652 for the set of known cervical edge regions
1654 and the set of known CIN 2/3 tissue regions 1656. Figure 85 shows a graph 1658 depicting
as a function of wavelength 1660 the mean broadband reflectance values 1662 for a set of known
vaginal wall regions 1664 and a set of known CIN 2/3 tissue regions 1666. Figure 86 shows a
graph 1668 depicting as a function of wavelength 1670 the mean fluorescence intensity values
1672 for the set of known vaginal wall regions 1674 and the set of known CIN 2/3 tissue regions
1676.

[0557] In one embodiment, features of the curves in Figures 83, 84, 85, and 86 are used in
determining the [CE]spec spectral mask metric. For example, from Figures 83 and 85, it is seen
that reflectance values for cervical edge/vaginal wall regions are lower than CIN 2/3 reflectance,
particularly at about 450 nm and at about 700 nm. From Figures 84 and 86, it is seen that there
is a "hump" in the fluorescence curves for cervical edge regions 1654 and vaginal wall regions
1674 at about 400 nm, where there is no such hump in the CIN 2/3 curve (1656/1676). This
causes the ratio of fluorescence intensity, F(530 nm)/F(410 nm), to be low at cervical
edge/vaginal wall regions, relative to that of CIN 2/3 regions. From Figure 86, the mean
fluorescence intensity of vaginal wall regions 1674 is lower than that of CIN 2/3 regions at least
from about 500 nm to about 540 nm. In one embodiment, these observations are combined to
determine the overall [CE]spec mask metric as follows:

BB(450 nm) · BB(700 nm)/BB(540 nm) < 0.30 OR
F²(530 nm)/F(410 nm) < 4.75.
The top line of the metric above reflects the observation that the mean reflectance of cervical
edge/vaginal wall tissue is comparable to that of CIN 2/3 tissue at about 540 nm and lower than
that of CIN 2/3 tissue at about 450 nm and about 700 nm. The bottom line of the metric above
reflects the observation that the fluorescence of a cervical edge/vaginal wall region may have a
lower fluorescence at 530 nm than CIN 2/3 tissue and that the cervical edge/vaginal wall region
may have a lower F(530 nm)/F(410 nm) ratio than CIN 2/3 tissue.
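
The [CE]spec metric can be sketched in the same style. The sketch reads the second line of the metric as the square of F(530 nm) divided by F(410 nm), which is the product of the two quantities named in the explanation above (the intensity at 530 nm and the F(530 nm)/F(410 nm) ratio). That reading, the function name `ce_spec_mask`, and the dictionary inputs are assumptions.

```python
def ce_spec_mask(F, BB):
    """Return True if a point is masked as cervical edge / vaginal wall.

    F:  dict mapping wavelength (nm) to fluorescence intensity
    BB: dict mapping wavelength (nm) to broadband reflectance
    """
    # Low reflectance at 450 nm and 700 nm, normalized by reflectance at 540 nm.
    low_reflectance = BB[450] * BB[700] / BB[540] < 0.30
    # F(530) * (F(530) / F(410)): low intensity at 530 nm and a low 530/410 ratio.
    low_fluorescence = F[530] ** 2 / F[410] < 4.75
    return low_reflectance or low_fluorescence
```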

[0558] Figure 87A depicts a reference image 1678 of cervical tissue from a patient scan in
which spectral data is used in arbitration and [CE]spec spectral masking. Figure 87B is an obgram
1680 of the interrogation points (regions) of the tissue sample depicted in Figure 87A and shows,
in yellow (1684), the points that are "masked" by application of the [CE]spec metric above.
White points (1682) in Figure 87B indicate regions that are filtered out by the arbitration and
low-signal mask of step 128, while pink points (1686) indicate regions remaining after
application of both the arbitration/low-signal mask of step 128 and the [CE]spec spectral
mask.

[0559] Step 1446 in Figure 74 depicts the determination of a fluids/mucus ([MU]spec) spectral
mask using data from the broadband reflectance spectrum, BB, at each interrogation point of the
tissue sample following the arbitration and low-signal masking step 128. In one alternate
embodiment, the fluorescence spectrum is used in place of or in addition to the broadband
reflectance spectrum. The [MU]spec mask identifies tissue sites covered with thick, opaque, and
light-colored mucus, as well as fluid that is pooling in the os or on top of the speculum during a
patient scan.

[0560] Figures 88, 89, 90, and 91 show steps in an exemplary discrimination analysis to
determine a [MU]spec spectral mask. Figure 88 shows a graph 1688 depicting as a function of
wavelength 1690 the mean broadband reflectance values 1692 for a set of known pooling fluids
regions 1694 and a set of known CIN 2/3 tissue regions 1696. Figure 89 shows a graph 1697
depicting as a function of wavelength 1698 the mean fluorescence intensity values 1700 for the
set of known pooling fluids regions 1702 and the set of known CIN 2/3 tissue regions 1704. The
difference between curves 1694 and 1696 in Figure 88 is pronounced. Thus, in one embodiment,
a term is included in the [MU]spec mask metric based on the best ratio of wavelengths found to
maximize values of D in the discrimination equation, Equation 91, as follows:

$$D = \frac{\left|\,\mu\big(BB(\lambda)/BB(\lambda')\big)_{\mathrm{Outlier}} - \mu\big(BB(\lambda)/BB(\lambda')\big)_{\mathrm{Tissue}}\,\right|}{\sqrt{\sigma^{2}\big(BB(\lambda)/BB(\lambda')\big)_{\mathrm{Outlier}} + \sigma^{2}\big(BB(\lambda)/BB(\lambda')\big)_{\mathrm{Tissue}}}} \qquad (91)$$

In one embodiment, values of D above are maximized using the broadband reflectance ratio
BB(594 nm)/BB(610 nm).

[0561] A scatter plot depicting discrimination between pooling fluids regions and CIN 2/3
tissue regions for a set of known reference data is obtained by comparing the ratio of arbitrated
broadband intensity, BB(594 nm)/BB(610 nm), to a threshold constant. Then, a graph of true
positive ratio (TPR) versus false positive ratio (FPR) in the discrimination between pooling
fluids regions and CIN 2/3 tissue regions is obtained using a threshold constant. For example, a
TPR of 56.3% and an FPR of 0.9% are obtained using a threshold constant of 0.74. The ratio of
false positives may be reduced by adjusting the threshold.

[0562] Figure 90 shows a graph 1722 depicting as a function of wavelength 1724 the mean
broadband reflectance values 1726 for a set of known mucus regions 1728 and a set of known
CIN 2/3 tissue regions 1730. Figure 91 shows a graph 1732 depicting as a function of
wavelength 1734 the mean fluorescence intensity values 1736 for the set of known mucus
regions 1738 and the set of known CIN 2/3 tissue regions 1740. The difference between curves
1728 and 1730 in Figure 90 is pronounced. Thus, in one embodiment, a term is included in the
[MU]spec metric based on the best ratio of wavelengths found to maximize values of D in the
discrimination equation, Equation 91 above. In one embodiment, this ratio is BB(456
nm)/BB(542 nm).

[0563] A scatter plot depicting discrimination between mucus regions and CIN 2/3 tissue
regions for a set of known reference data may be obtained by comparing the ratio of arbitrated
broadband intensity, BB(456 nm)/BB(542 nm), to a threshold constant. Then, a graph of true
positive ratio (TPR) 1752 versus false positive ratio (FPR) 1754 in the discrimination between
mucus regions and CIN 2/3 tissue regions is obtained using a threshold constant. For example,
a TPR of 30.4% and an FPR of 0.8% are obtained using a threshold constant of 1.06. The ratio of
false positives may be reduced by adjusting the threshold.

[0564] In one embodiment, the discrimination analysis illustrated in Figures 88, 89, 90, and 91
leads to the overall [MU]spec mask metric as follows:

BB(456 nm)/BB(542 nm) < 1.06 OR
BB(594 nm)/BB(610 nm) > 0.74.
The metric above combines the sites identified by the pooled fluids mask, as indicated by the
bottom line of the metric above, with the sites identified by the mucus mask, as indicated by the
top line of the metric above.
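
As a short illustration, the [MU]spec metric can be written so that the mucus and pooled-fluid components stay visible. The function name `mu_spec_mask` and the dictionary input are hypothetical; the thresholds are those quoted in the metric above.

```python
def mu_spec_mask(BB):
    """Return (masked, mucus, pooled_fluid) for one interrogation point.

    BB: dict mapping wavelength (nm) to arbitrated broadband reflectance.
    """
    mucus = BB[456] / BB[542] < 1.06         # top line: mucus mask
    pooled_fluid = BB[594] / BB[610] > 0.74  # bottom line: pooled fluids mask
    return mucus or pooled_fluid, mucus, pooled_fluid
```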

[0565] Figure 92A depicts a reference image 1758 of cervical tissue from a patient scan in
which spectral data is used in arbitration and [MU]spec spectral masking. Figure 92B is an
obgram 1770 of the interrogation points (regions) of the tissue sample depicted in Figure 92A
and shows, in yellow (1768), the points that are "masked" by application of the [MU]spec metric
above. White points (1766) in Figure 92B indicate regions that are filtered out by the arbitration
and initial low-signal mask of step 128, while pink points (1770) indicate regions remaining after
application of both the arbitration/low-signal mask of step 128 and the [MU]spec spectral
mask.

Image masking 

[0566] The invention also comprises an image masking feature. Image masks identify data 
from one or more images obtained during patient examination that are potentially non- 
representative of regions of interest of the tissue sample. Potentially non-representative data 
includes data that are affected by the presence of an obstruction, such as blood, mucus, a
speculum, pooled fluid, or foam, for example. In one embodiment, a reference image of an in-situ
cervical tissue sample is obtained just prior to a spectral scan, and image masks are
determined from the reference image to reveal where there may be an obstruction or other area
that is not of diagnostic interest. Areas that are not of diagnostic interest include regions affected
by glare, regions of the os, vaginal wall tissue, or regions that are otherwise outside the area of 
interest of the tissue sample. These areas may then be "masked" from the analysis of spectral 
data obtained from tissue regions that coincide with the obstruction, for example. The image 
masks are combined with each other and/or with the spectral masks, as shown in block 1422 of 
Figure 73 and as shown in Figure 74. The resultant masks include "hard" masks and "soft" 
masks, described in more detail herein. Hard masks result in a characterization (or diagnosis) of 
"Indeterminate" at affected regions, while soft masking provides a means of weighting spectral 
data according to the likelihood that the data is representative of clear, unobstructed tissue in a 
region of interest. 

[0567] In one embodiment, image masks are combined and applied as indicated in the block 
diagram 1438 of Figure 74, in order to account for the identification of image-masked 
interrogation points in the tissue-class/state-of-health classification step 1432. Steps 1448, 1450, 
1452, 1454, 1456, 1458, 1460, 1462, and 1464 in Figure 74 depict the determination of image 
masks from the image data obtained around the time of the patient spectral scan. These image 
masks are discussed in more detail below. 

[0568] Figure 93 depicts image masks 1782, 1784, 1786 determined from a reference image of 
a tissue sample and conceptually shows how the image masks are combined with respect to each 
interrogation point (region) 1790 of the tissue sample, according to one embodiment. Generally, 
for a given interrogation point 1790 in the scan pattern 1788, the system determines whether any 
of the features detected by the image masks, such as the os image mask 1784 and the blood 
image mask 1786, intersects that interrogation point (region) 1790. For certain image masks, a 
percent coverage is determined for regions they intersect. For some image masks, if any of the 
mask intersects a region, the region is flagged as "masked". 

[0569] In one embodiment, a backend process determines the coverage of one or more masks 
for each interrogation point of the scanning pattern. Given a known correspondence between 
image pixels and interrogation points, a given point is assigned a percentage coverage value for a 
feature determined by a given image mask, such as blood detected by the Bloodvid image mask
1458 in Figure 74. The percentage coverage value corresponds to the number of pixels for the 
given interrogation point coinciding with the selected image mask feature, divided by the total 
number of pixels for that interrogation point. For example, if the blood mask for a given 
interrogation point coincides with 12 out of 283 pixels that cover the point, then the percentage 
coverage for that interrogation point is 12/283, or 4.2%. 
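
The percentage coverage calculation can be illustrated with boolean pixel arrays. This is a sketch only: the function name `percent_coverage` and the representation of an interrogation point as a boolean footprint image are assumptions made for the example.

```python
import numpy as np

def percent_coverage(mask, point_pixels):
    """Percentage of an interrogation point's pixels that coincide with a mask.

    mask:         2-D boolean array, True where the image mask detects its feature
    point_pixels: 2-D boolean array, True for pixels belonging to the interrogation point
    """
    total = int(point_pixels.sum())
    covered = int(np.logical_and(mask, point_pixels).sum())
    return 100.0 * covered / total
```

With a footprint of 283 pixels of which 12 coincide with the blood mask, the function returns 12/283 expressed as a percentage, about 4.2.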
[0570] Steps 1468, 1470, 1472, and 1474 in Figure 74 demonstrate how the image masks are
combined in one embodiment, and steps 1466, 1476, 1424, 1478, 1480, 1424, 1426, 1428, and
1430 in Figure 74 demonstrate how the combined masks are applied with respect to the tissue-
class/state-of-health classifications at the spectral interrogation points, in one embodiment.
These steps are discussed in more detail herein.

[0571] The image masks in Figure 74 are determined using image processing methods. These 
methods include color representation, spatial filtering, image thresholding, morphological 
processing, histogram processing, and component labeling methods, for example. 
[0572] In one embodiment, images are obtained in 24-bit RGB format. There are a number of
ways to quantify image intensity and other image characteristics at each pixel. Most of the
image masks in Figure 74 use values of luminance (grayscale intensity) at each pixel. In one
embodiment, luminance, Y, at a given pixel is defined as follows:

Y = 0.299R + 0.587G + 0.114B (92)
where Y is expressed in terms of red (R), green (G), and blue (B) intensities; and where R, G,
and B range from 0 to 255 for a 24-bit RGB image. Some of the image masks in Figure 74 use
one or more of the following quantities:

$$\mathrm{redness} = \frac{R-G}{R+G} + \frac{R-B}{R+B} \qquad (93)$$

$$\mathrm{greenness} = \frac{G-R}{G+R} + \frac{G-B}{G+B} \qquad (94)$$

$$\mathrm{blueness} = \frac{B-R}{B+R} + \frac{B-G}{B+G} \qquad (95)$$

where R, G, and B are as defined above.
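
A possible per-pixel computation of luminance and the three color quantities is sketched below. The function name `color_features` and the small constant `eps` that guards against division by zero at black pixels are assumptions added for the example and are not part of Equations 92-95.

```python
import numpy as np

def color_features(rgb, eps=1e-9):
    """Compute luminance, redness, greenness, and blueness.

    rgb: array of shape (..., 3) holding R, G, B values in the range 0 to 255.
    eps: assumed guard term so that pixels with a zero denominator yield 0 instead of NaN.
    """
    rgb = np.asarray(rgb, dtype=float)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    redness = (R - G) / (R + G + eps) + (R - B) / (R + B + eps)
    greenness = (G - R) / (G + R + eps) + (G - B) / (G + B + eps)
    blueness = (B - R) / (B + R + eps) + (B - G) / (B + G + eps)
    return Y, redness, greenness, blueness
```

Each color quantity ranges from -2 to 2: a pure red pixel gives a redness near 2, and a gray pixel gives 0 for all three.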

[0573] Determination of the image masks in Figure 74 includes the use of one-dimensional (1-D)
and two-dimensional (2-D) filters. The types of filters used include low-pass, smoothing
filters and gradient, edge detection filters. The 1-D filters generally range in size from 3 to 21
pixels and the 2-D filters generally range from 3 x 3 to 15 x 35 pixels, although other filter sizes
may be used. In one embodiment, box car filters are the preferred type of low-pass (smoothing)
filters. Box car filters replace the value at the center of the filter support with an equally-
weighted average of all pixels within the filter support. In one embodiment, the preferred types
of gradient filters are Sobel and Laplacian of Gaussian filters.
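
A box car filter of the kind described above can be sketched with plain NumPy. The function name `boxcar_2d` is hypothetical, and replicating edge pixels to handle the image border is an assumption, since the text does not specify border handling.

```python
import numpy as np

def boxcar_2d(img, size=3):
    """Equal-weight mean filter with a size x size support (size must be odd).

    Border pixels are handled by replicating the image edge (an assumed choice).
    """
    img = np.asarray(img, dtype=float)
    r = size // 2
    padded = np.pad(img, r, mode="edge")
    out = np.zeros_like(img)
    # Sum the shifted copies of the image that fall inside the filter support.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)
```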

[0574] In one embodiment, the image masks in Figure 74 are determined using image
thresholding, a subclass of image segmentation in which the image is divided into two segments.
The criterion for assigning a pixel to one of the two segments is whether its value is less than,
larger than, or equal to a prescribed threshold value. A binary image may be obtained by
marking pixels having values less than the threshold with zeros and the remaining pixels with
ones. Some image masks are determined using multiple thresholding and/or dynamic
thresholding, where the threshold for each pixel or group of pixels is computed dynamically
from image statistics, for example.

[0575] In one embodiment, the determination of the image masks in Figure 74 includes binary
morphological processing. Binary morphological processing is performed on a binarized
(thresholded) image to smooth object boundaries, change the size of objects, fill holes within
objects, remove small objects, and/or separate nearby objects. Morphological operators used
herein include dilation, erosion, opening, and closing. An operator may be defined by (1) a
binary mask or structuring element, (2) the mask origin, and (3) a mathematical operation that
defines the value of the origin of the mask. In one embodiment, a 3 x 3 square structuring
element is used, and is generally preferred unless otherwise specified.

[0576] In one embodiment, dilation increases the size of a binary object by half the size of the
operator mask/structuring element. Erosion is the inverse of dilation and decreases the size of a
binary object. For example, an erosion of a binary object is equivalent to the dilation of the
background (non-objects). Opening is an erosion followed by a dilation, and closing is a dilation
followed by an erosion. As used herein, dil(Img, n) denotes performing n dilation steps on
image Img with a 3 x 3 square structuring element, and erod(Img, n) denotes performing n
erosion steps on image Img with a 3 x 3 square structuring element.
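
The dil(Img, n) and erod(Img, n) operations can be sketched as follows, with erosion computed as dilation of the background as described above. This is an illustrative sketch: treating pixels outside the image as background during dilation (so that an object touching the border is not eroded from the border side) is an assumption.

```python
import numpy as np

def dil(img, n=1):
    """Perform n dilation steps with a 3 x 3 square structuring element."""
    out = np.asarray(img, dtype=bool)
    for _ in range(n):
        # Pixels outside the image are treated as background (assumed).
        p = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        # A pixel is set if any pixel in its 3 x 3 neighborhood is set.
        for dy in range(3):
            for dx in range(3):
                grown |= p[dy:dy + out.shape[0], dx:dx + out.shape[1]]
        out = grown
    return out

def erod(img, n=1):
    """Perform n erosion steps, computed as dilation of the background."""
    return ~dil(~np.asarray(img, dtype=bool), n)
```

With these two helpers, an opening is `dil(erod(img, 1), 1)` and a closing is `erod(dil(img, 1), 1)`.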

[0577] In one embodiment, the determination of the image masks in Figure 74 includes the use 
of histograms. Here, a histogram relates intervals of pixel luminance values (or other 
quantification) to the number of pixels that fall within those intervals. In one embodiment, 
histogram processing includes smoothing a histogram using a 1-D low-pass filter, detecting one
or more peaks and/or valleys (maxima and minima), and/or computing thresholds based on the 
peaks and/or valleys. 

[0578] In one embodiment, the determination of the image masks in Figure 74 includes 
component labeling. Component labeling is used to join neighboring pixels into connected 
regions that comprise the components (objects) in an image. Extracting and labeling of various
disjoint and connected components (objects) in an image allows separate analysis for each 
object. 

[0579] In component labeling of a binary image using 8-connectivity, a connected components
labeling operator scans the image by moving along each row until coming to a pixel p with a value
V=1; the operator then examines the four neighbors of p that have already been encountered in
the scan. For example, the four neighbors of p are (1) the pixel to the left of p, (2) the pixel
directly above p, and (3,4) the two pixels in the row above pixel p that are diagonal to pixel p.
Based on this information, p is labeled as follows:

• If all four neighbors have V=0, assign a new label to p, ELSE

• If only one neighbor has V=1, assign its label to p, ELSE

• If more than one neighbor has V=1, assign one of their labels
to p and note the equivalences.

After completing the scan, the equivalent label pairs are sorted into equivalence classes and a
unique label is assigned to each class. A second scan is made through the image, and each label
is replaced by the label assigned to its equivalence class. Component labeling of a binary image
with 4-connectivity may be performed similarly.
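
The two-scan labeling procedure may be sketched as follows for 8-connectivity. The use of a
union-find table to record label equivalences is an implementation choice assumed for this
example; the function name label_components is likewise hypothetical.

```python
def label_components(img):
    """Two-pass connected component labeling of a binary image (8-connectivity)."""
    rows, cols = len(img), len(img[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[k] is the union-find parent of label k; 0 is background

    def find(k):
        while parent[k] != k:
            k = parent[k]
        return k

    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1:
                continue
            # Neighbors already visited: left, upper-left, above, upper-right.
            seen = [labels[rr][cc]
                    for rr, cc in ((r, c - 1), (r - 1, c - 1), (r - 1, c), (r - 1, c + 1))
                    if rr >= 0 and 0 <= cc < cols and labels[rr][cc]]
            if not seen:
                parent.append(len(parent))  # assign a new label
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = seen[0]
                for other in seen[1:]:  # note the equivalences
                    a, b = find(seen[0]), find(other)
                    parent[max(a, b)] = min(a, b)

    # Second scan: replace each label by the unique label of its equivalence class.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

A 4-connectivity variant would examine only the left and upper neighbors in the first scan.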

[0580] In one embodiment, an image mask is determined using data from a representative
image of a tissue sample obtained near to the time of a spectral scan of the tissue (just before,
during, and/or just after the spectral scan). In one embodiment, the representative image is
obtained within about 30 seconds of the beginning or ending of the spectral scan; in another
embodiment, the representative image is obtained within about 1 minute of the beginning or
ending of the spectral scan; and in another embodiment, the representative image is obtained
within about 2 minutes of the beginning or ending of the spectral scan. Other ranges of time in
relation to the spectral scan are possible. In one embodiment, there is only one reference image
from which all the image masks are determined.

Glarevid

[0581] Step 1462 in Figure 74 depicts the determination of a glare mask, Glarevid, for an image
of a tissue sample. Glarevid indicates regions of glare in a tissue image. Glarevid is also used in
the computation of other image masks. Figure 94A depicts an exemplary image 1794 of cervical
tissue used to determine a corresponding glare image mask, Glarevid. Figure 94B represents a
binary glare image mask, Glarevid, 1796 corresponding to the tissue image 1794 in Figure 94A.

[0582] The white specks of glare in the tissue image 1794 in Figure 94A are identified by the
image mask 1796. The image mask is determined using an adaptive thresholding image
processing procedure. Different thresholds are applied in different areas of the image, since the
amount of illumination may vary over the image, and a threshold luminance indicative of glare
in one area of the image may not indicate glare in another, lighter area of the image. In one
embodiment, for example, an image of a tissue sample is divided into a 4 by 4 grid of
equally-sized, non-overlapping blocks. A suitable glare threshold is computed for each block, and the
subimage within that block is binarized with the computed threshold to yield a portion of the
output glare segmentation mask, Glarevid. Each block computation is independent, and blocks
are serially processed until the complete binary glare mask, Glarevid, has been calculated.
For each block, multiple thresholds based on luminance value and/or histogram shape are
computed and are used to detect and process bimodal distributions.

[0583] Figure 95 is a block diagram depicting steps in a method of determining a glare image
mask, Glarevid, for an image of cervical tissue. Step 1802 in Figure 95 indicates dividing an
image into a 4x4 grid of cells (blocks) 1804 and computing a histogram for each cell that is then
used to determine thresholds 1806 applicable to that block. Each histogram correlates intervals
of luminance values, Y, (Y ranging from 0 to 255) to the number of pixels in the cell (subimage)
having luminance values within those intervals.

[0584] Step 1806 in Figure 95 indicates determining thresholds applicable to a given cell of the
image. For example, Figure 96 shows a histogram 1842 for one cell of an exemplary image.
Curve 1848 indicates a raw histogram plot for the cell (subimage), and curve 1850 indicates the
curve after 1-D filtering using a 21-point boxcar filter. Quantities 1840 related to thresholding
that are calculated from each histogram 1842 include Tpk (peak), Tvy (valley), Tlp, Ts, Tdo, and
T90, all of which are described below. The exemplary histogram 1842 in Figure 96 shows bars
indicating values of Tpk (1852), Tvy (1854), Tlp (1856), Ts (1858), Tdo (1860), and T90 (1862) for
the cell histogram curve. The heavy dashed line (1854) indicates the final threshold chosen for
the cell according to the method of Figure 95.

[0585] The following describes the steps of the method 1800 shown in Figure 95, according to
one embodiment.

[0586] The method 1800 in Figure 95 comprises calculating intended thresholds in step 1806.
Four thresholds are computed to decide whether the block (cell) contains glare:

1. Ts = mean + 3 * std, where mean is the average intensity of the block and std is
its standard deviation.

2. Tlp = last peak of smoothed histogram. Smoothing is performed using a width 5
maximum order statistic filter.

3. Tdo = Lmax + 2 * (Ldo - Lmax), where Lmax is the index (gray level) at
which the 21-point boxcar filtered histogram, sHist, reaches its maximum value
sHistMax, and Ldo is the first point after Lmax at which the filtered histogram
value falls below 0.1 * sHistMax.

4. T90 is defined so that 90% of the graylevels greater than 210 are greater than
T90.
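
As a hedged illustration, the thresholds Ts and Tdo may be computed as sketched below. The
function names are hypothetical, the population standard deviation is assumed (the text does
not say which estimator is used), and s_hist is assumed to be an already-filtered histogram
indexed by gray level that does fall below 10% of its maximum somewhere after the peak.

```python
from statistics import mean, pstdev


def threshold_ts(pixels):
    """Ts = mean + 3 * std over the pixel intensities of one block."""
    return mean(pixels) + 3 * pstdev(pixels)


def threshold_tdo(s_hist):
    """Tdo = Lmax + 2 * (Ldo - Lmax) for a smoothed histogram s_hist."""
    l_max = max(range(len(s_hist)), key=s_hist.__getitem__)
    s_hist_max = s_hist[l_max]
    # Ldo: first gray level after Lmax where the histogram drops below 10% of its peak.
    l_do = next(i for i in range(l_max + 1, len(s_hist))
                if s_hist[i] < 0.1 * s_hist_max)
    return l_max + 2 * (l_do - l_max)
```

For the toy histogram [0, 10, 100, 50, 20, 5, 0], Lmax is 2 and Ldo is 5, giving Tdo = 8.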

[0587] Next, the method 1800 in Figure 95 includes a block (cell) glare detector in step 1810.
The block (cell) glare detector assesses whether glare is present in the block and selects the next
block if no glare is detected. The block is assumed to have no glare if the following condition is
met:

((Tlp < Ts) AND (Ts < T90)) OR

((Tlp < Tdo) AND (Tdo < T90)) OR

((Tlp < Tdo) AND (Tlp < Ts) AND (Tlp < T90)) OR

((Tlp < 0.8 * T90) AND (no valid glare mode as described in the bimodal
histogram detection section below)).
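
The no-glare condition may be expressed as a single predicate, as sketched below. The function
and argument names are assumptions of this example; t_lp denotes the last-peak threshold, and
valid_glare_mode is a Boolean supplied by the bimodal histogram detection step.

```python
def block_has_no_glare(t_lp, t_s, t_do, t_90, valid_glare_mode):
    """Return True when the block is assumed to contain no glare."""
    return ((t_lp < t_s and t_s < t_90)
            or (t_lp < t_do and t_do < t_90)
            or (t_lp < t_do and t_lp < t_s and t_lp < t_90)
            or (t_lp < 0.8 * t_90 and not valid_glare_mode))
```
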

[0588] Next, the method 1800 in Figure 95 comprises selecting a candidate threshold, Tc, in
step 1812. A candidate threshold Tc is chosen based upon the values of the intermediate
thresholds Ts, Tlp, Tdo and T90 according to the following rules:

1. if (Tlp < T90):

   a. if (Tdo < Tlp / 2):

      i. if (Ts < Tlp): Tc = (Ts + Tlp) / 2

      ii. else Tc = Tlp

   b. else Tc = min (Tdo, Tlp)

2. (Tlp >= T90) High intensity glare

   a. if (Ts <= T90):

      i. if ((Ts <= 100) AND (Tdo <= 100)): Tc = max (Ts, Tdo)

      ii. else if ((Ts <= 100) AND (Tdo > 100)): Tc = min (Tdo, Tlp)

      iii. else Tc = min (Ts, Tdo)

   b. else

      i. if (Tdo < 100): Tc = T90

      ii. else Tc = min (Tdo, T90).
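
The rule set for the candidate threshold may be sketched as a single function. The names are
hypothetical, and the test in rule 1(a) is read here as Tdo less than half of the last-peak
threshold, which is an interpretation of the printed text rather than a confirmed value.

```python
def candidate_threshold(t_s, t_lp, t_do, t_90):
    """Select the candidate threshold Tc from the four intermediate thresholds."""
    if t_lp < t_90:
        if t_do < t_lp / 2:  # assumed reading of rule 1(a)
            return (t_s + t_lp) / 2 if t_s < t_lp else t_lp
        return min(t_do, t_lp)
    # Last peak at or above T90: high intensity glare.
    if t_s <= t_90:
        if t_s <= 100 and t_do <= 100:
            return max(t_s, t_do)
        if t_s <= 100 and t_do > 100:
            return min(t_do, t_lp)
        return min(t_s, t_do)
    if t_do < 100:
        return t_90
    return min(t_do, t_90)
```
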

[0589] Next, the method 1800 in Figure 95 includes detecting a bimodal histogram in step
1806. Step 1806 detects bimodal histograms that are likely to segment glare from non-glare and
uses the 21-point boxcar filtered histogram sHist to determine Tvy after computing Tpk and
Tcross, as described herein. To compute Tpk, sHist is searched backwards from the end until a
point Tpk is reached where the value is greater than the mean and maximum of its 5 closest right
and left neighbors and where Tpk is greater than or equal to 10. Tcross is the point after Tpk (in
the backwards search) where the histogram value crosses over the value it has at Tpk. If the
histogram is unimodal, Tpk is equal to Lmax, the graylevel where sHist attains its maximum
value, and Tcross is 0. Tvy is the minimum point on sHist between Tpk and Tcross if the
following glare condition, called valid glare mode, is met:

(Tpk > 175) AND (Tpk > Lmax) AND
(sHist[Tpk] < 0.6 * sHist[Lmax]) AND
((Tpk - Tcross > 20) OR (Tpk > T90)) AND
((Tpk > (mean + (1.5 * std))) OR (Tpk > T90)).
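
The valid glare mode condition may be sketched as the following predicate. The argument names
are assumptions; s_hist is the smoothed histogram indexed by gray level, and mean and std are
the block statistics.

```python
def valid_glare_mode(t_pk, t_cross, l_max, t_90, s_hist, mean, std):
    """Return True when the histogram has a valid glare mode at peak t_pk."""
    return (t_pk > 175 and t_pk > l_max
            and s_hist[t_pk] < 0.6 * s_hist[l_max]
            and (t_pk - t_cross > 20 or t_pk > t_90)
            and (t_pk > mean + 1.5 * std or t_pk > t_90))
```
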

[0590] Next, the method 1800 in Figure 95 includes selecting a final threshold in steps 1814,
1816, 1818, 1820, 1822, 1824, and 1826. The final threshold selected depends on whether the
histogram is bimodal or unimodal. For a bimodal histogram with a valid glare mode, the final
threshold T is Tvy if 175 < Tvy < Tc. In all other cases (i.e. for unimodal histograms with a
candidate threshold Tc and for bimodal histograms with a valley threshold Tvy outside the range
175 to Tc), Tc is chosen as the final threshold unless it can be incremented until sHist[Tc] <
0.01 * sHist[Lmax] or Tc > Tlim under the following two conditions. First, if a value L exists
in the range [Tc,255] where sHist[L] > sHist[Tc], define Lmin to be the gray value where sHist
reaches its minimum in the range [Tc,L]. Then, Tc should not be incremented beyond Lmin,
and the limit threshold Tlim = Lmin. If L < 150, then Tlim = 210. Secondly, if L does not
exist, Tlim = 210.

[ROI]vid

[0591] Step 1448 in Figure 74 depicts the determination of a general region-of-interest mask,
[ROI]vid, for an image of a tissue sample. The general region-of-interest mask determines where
there is tissue in an image, and removes the non-tissue background. [ROI]vid is also used in the
computation of other image masks. Figure 97A depicts an exemplary image 1894 of cervical
tissue used to determine a corresponding region-of-interest mask, [ROI]vid, 1896 corresponding
to the tissue image 1894 in Figure 97A. The mask 1896 excludes the non-tissue pixels in image
1894.

[0592] The [ROI]vid mask detects the general areas of the image indicative of tissue, and is
determined by thresholding a pre-processed red channel image of the tissue and by performing
additional processing steps to remove unwanted minor regions from the thresholded image,
explained in more detail below.

[0593] Figure 98 is a block diagram 1900 depicting steps in a method of determining a region-
of-interest image mask, [ROI]vid, for an image of cervical tissue. The following describes the 
steps of the method shown in Figure 98 (1900), according to one embodiment. 

[0594] The method 1900 includes pre-processing in step 1902. First, smooth the red channel
image by twice applying a 5x5 boxcar filter. The filtered image is sRed. Next, compute a best
dynamic threshold for sRed as follows. Create a foreground binary image of sRed using a
threshold of 15. Create a glare mask binary image, glareMsk, using the glare mask process
Glarevid above. Create a valid cervix pixel image, validPix, by binary AND-ing foreground and
glareMsk inverse. Binary erode validPix: evalidPix = erod(validPix, 3). In evalidPix, find
the top row containing the first valid pixel, topR; find the bottom row containing the last valid
pixel, botR; the middle row is expressed as midR = (topR + botR)/2; then, set all evalidPix
pixels above midR to 0. Compute the mean, mean, and standard deviation, stdDev, of sRed on the
region defined by evalidPix. The best dynamic threshold is then T = max(10, min(mean - 1.5 *
stdDev, 80)). Threshold sRed using T in step 1904.
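
The clamping behaviour of the dynamic threshold may be illustrated as follows. The function
name roi_threshold is an assumption of this sketch, and the inputs are taken to be the mean
and standard deviation of sRed already computed on the eroded valid-pixel region.

```python
def roi_threshold(mean, std_dev):
    """Dynamic threshold T, clamped to the range 10 to 80."""
    return max(10, min(mean - 1.5 * std_dev, 80))
```

For example, a region with mean 100 and standard deviation 20 yields T = 70, while very bright
or very dark regions are clamped to 80 and 10, respectively.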

[0595] Next, the method 1900 in Figure 98 includes thresholding sRed using T in step 1904.
Then, step 1906 is performing a binary component labeling using 4-way connectivity. Finally,
step 1908 is computing the area of each object obtained in the previous step and selecting the
largest object. Flood fill the background of the object selected in the previous step to fill holes.
The result is the [ROI]vid mask.

[ST]vid

[0596] Step 1450 in Figure 74 depicts the determination of a smoke tube mask, [ST]vid, for an
image of a tissue sample. The smoke tube mask determines whether the smoke tube portion of
the speculum used in the procedure is showing in the image of the tissue sample. The smoke
tube mask also identifies a portion of tissue lying over the smoke tube (which may also be
referred to as "smoke tube" tissue) whose optical properties are thereby affected, possibly
leading to erroneous tissue-class/state-of-health characterization. Figure 99A depicts an
exemplary image 1932 of cervical tissue used to determine a corresponding smoke tube mask,
[ST]vid, 1934 shown in Figure 99B. The smoke tube mask is determined in part by isolating the
two "prongs" holding the smoke tube tissue. The two prongs are visible in the image 1932 of
Figure 99A at reference numbers 1930 and 1931. In some images, the prongs are not visible.
However, the smoke tube tissue in these images (without visible prongs) is generally either a
blue or blue-green color with almost no red component; and the smoke tube in these images is
identified (and removed from consideration) by the general region-of-interest image mask,
[ROI]vid.

[0597] Figure 100 is a block diagram 1938 depicting steps in a method of determining a smoke
tube mask, [ST]vid, for an image of cervical tissue. Image 1944 is an exemplary input image for
which a corresponding smoke tube mask 1960 is computed. Image 1944 shows a circle 1945
used in steps 1954 and 1956 of the method in Figure 100.

[0598] The following describes the steps of the method shown in Figure 100, according to one 
embodiment. 

[0599] The method 1938 in Figure 100 comprises step 1946, pre-processing the image.
Pre-processing includes processing each RGB input channel with a 3x3 median filter followed by a
3x3 boxcar filter to reduce noise. Step 1946 also includes calculating or retrieving the general
ROI mask ROImsk ([ROI]vid, described above) and the glare mask glareMsk (Glarevid,
described above), and computing the search image, srchImg, as follows. First, compute the
redness image Rn. Set to zero all values in Rn that are outside ROImsk. Autoscale the redness
image to the [0,1] range. Then, compute srchImg, which will be used at the final stages of the
algorithm to compute a rough correlation to find the best circle location. SrchImg is a linear
combination of the redness and red images: srchImg = (1 - A) * Rn + A * R. The linear
weight factor A is in the range [0.2, 0.8]. Form validPix = ROImsk AND not(dil(glareMsk,
3)). Compute the means, meanR, meanG, and meanB, of the RGB channels on the region defined by
validPix. The weight A is initially computed as: A = max(0.5, min((2 * meanR) / (meanG
+ meanB), 1.5)). Remap the value of A into the range [0.2, 0.8]: A = 0.2 + (0.6 * (A - 0.5)).
SrchImg is computed using the A factor determined above.
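
The weight computation and the linear combination may be sketched as follows. The function
names are hypothetical, images are assumed to be nested lists of equal size, and the sum of the
green and blue means is assumed to be non-zero.

```python
def weight_a(mean_r, mean_g, mean_b):
    """Compute the blending weight A and remap it into the range [0.2, 0.8]."""
    a = max(0.5, min((2 * mean_r) / (mean_g + mean_b), 1.5))
    return 0.2 + 0.6 * (a - 0.5)


def search_image(rn, r, a):
    """Blend the redness image rn with the red image r using weight a."""
    return [[(1 - a) * rn_v + a * r_v for rn_v, r_v in zip(rn_row, r_row)]
            for rn_row, r_row in zip(rn, r)]
```

When the three channel means are equal, the initial ratio is 1.0 and the remapped weight is
0.5, so the redness and red images contribute equally.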

[0600] Next, the method 1938 in Figure 100 comprises a prong detector filter in step 1948.
The prong detector is applied to the red image, R, and to an enhanced red image, RE, to produce 2
different prong images that will be arbitrated later. First, calculate the red-enhanced image, RE
= R + max(R - G, R - B). Next, set up the prong detector filter. The filter is designed to be
sensitive to smoke-tube prongs and to reject glare, edges and other features. The filter is a
rectangular 35 by 15 separable filter. The horizontal filter H is defined by H = [-1.5 -1.5 -1.5
-1.5 -1.5 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 -1.5 -1.5 -1.5 -1.5 -1.5]. The vertical
filter V is a boxcar filter of length 15. Next, apply the prong filter to the R and RE images, yielding
Rprong and REprong. Clip the filtered images to 0 and autoscale to the range [0, 1]. Set the
bottom half of each filtered image as well as the first 20 and the last 20 columns to 0 (there are
no prongs in these sections of images). Then, find a maximum value for each of the first 125
rows of the 2 filtered images. Find the constants Rfact and REfact for each filtered image.
These constants are defined as the mean of the maxima of the first 125 rows divided by the mean of
the first 125 rows. If (Rfact > REfact), use Rprong as the prong search image, iProng;
otherwise use REprong.
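
For illustration, the horizontal part of the prong detector may be sketched as below. The
35-tap kernel is assumed to consist of five taps of -1.5, five zeros, fifteen ones, five zeros,
and five taps of -1.5, which sums to zero; treating pixels outside the row as 0 is a further
assumption of this example.

```python
# Assumed 35-tap horizontal kernel: negative flanks, zero guard bands, positive core.
H = [-1.5] * 5 + [0.0] * 5 + [1.0] * 15 + [0.0] * 5 + [-1.5] * 5


def red_enhanced(r, g, b):
    """Red-enhanced value RE = R + max(R - G, R - B) for one pixel."""
    return r + max(r - g, r - b)


def filter_row(row, kernel):
    """Correlate one image row with the kernel, treating outside pixels as 0."""
    half = len(kernel) // 2
    return [sum(k * row[i + j - half]
                for j, k in enumerate(kernel)
                if 0 <= i + j - half < len(row))
            for i in range(len(row))]
```

With this kernel, a bright bar 15 pixels wide on a dark background produces its peak response
at the bar center, while a uniformly bright row produces no response away from the borders.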

[0601] Next, the method 1938 in Figure 100 comprises thresholding, component analysis, and
prong selection in step 1950. Step 1950 is used to select prongs. First, threshold the iProng image
with a threshold of 0.2. Perform binary component labeling to obtain all regions (objects).
Compute region (object) statistics, including area, centroid, and major and minor axis length.
Filter prong regions (objects). Discard each region (object) that satisfies any of the following
criteria:

1. Region size < 300.

2. iProng maximum on object < 0.4.

3. Region does not extend above row 100.

4. Minor axis length >= 30.

5. Region does not overlap with ROImsk.

6. Region centroid is below row 140.

7. Centroid y-value > 40 and object thinness = (major axis length/minor axis
length) <= 2.

Choose as the main prong the brightest remaining region (i.e. where the region maximum value
is greater than the maxima from all other remaining regions). Filter all other prong regions
based upon the distance from the main prong by calculating the distance from each region's
centroid to the centroid of the main prong, and discarding the region if the intra-centroid
distance > 160 or if the intra-centroid distance < 110.
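
The region filtering criteria may be sketched as follows. The dictionary keys are hypothetical
names for the region statistics, rows are assumed to increase downward, and "does not extend
above row 100" is interpreted as the topmost row of the region being at or below row 100.

```python
import math


def keep_prong_region(region):
    """Return True if a candidate region survives all seven discard criteria."""
    thinness = region["major_axis"] / region["minor_axis"]
    discard = (
        region["area"] < 300
        or region["max_iprong"] < 0.4
        or region["top_row"] >= 100          # does not extend above row 100
        or region["minor_axis"] >= 30
        or not region["overlaps_roi"]
        or region["centroid_row"] > 140      # centroid is below row 140
        or (region["centroid_row"] > 40 and thinness <= 2)
    )
    return not discard


def keep_second_prong(centroid, main_centroid):
    """Keep a region only if it lies 110 to 160 pixels from the main prong."""
    distance = math.dist(centroid, main_centroid)
    return 110 <= distance <= 160
```
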

[0602] Next, method 1938 in Figure 100 comprises validation of the selected prongs in step
1952. For each retained prong object in step 1950, the following computations are performed to
validate the selected prongs. Define pad, the rough distance from an object to its perimeter.
Here, pad is set to 8. Form images of the dimension of the bounding box of the object plus pad
pixels on each side (OBJ). Crop the original object of the prong search image, IOrig, from the
original unsmoothed red channel image, Rorig, and form the binarized image BWProng.
Compute the internal region, intReg = erod(dil(OBJ, 2), 1). Compute the object perimeter region,
perObj = dil((dil(OBJ, 2) AND not(OBJ)), 2). Compute the mean and standard deviation,
mean and std, of the object on the interior region, intReg, and the mean, pmean, on the
perimeter region perObj. Compute left/right bias by computing locations of the center points of
the object top row and bottom row, drawing a line connecting those 2 points to divide the
perimeter region, perObj, into 2 sections, calculating the mean value of iProng on each of the
half perimeter sections, LperMean and RperMean, and using the means to compute the left/right
bias, LRBias, using LRBias = max(LperMean/RperMean, RperMean/LperMean).
Discard any objects where the following holds: (mean/pmean < 1.4) OR (std > 0.55) OR
((LRBias > 1.45) AND (mean/pmean < 1.48)). If more than 2 prong candidates are
remaining, keep the one closest to the main prong. If no prong candidates are left, the smoke
tube mask is empty.
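
The validation rule may be sketched as a predicate on the interior and perimeter statistics.
The function name is hypothetical, and the first ratio limit is taken to be 1.4, which is an
assumed reading of the printed value.

```python
def keep_validated_prong(mean, std, pmean, lper_mean, rper_mean):
    """Return True if a prong candidate passes the interior/perimeter checks."""
    lr_bias = max(lper_mean / rper_mean, rper_mean / lper_mean)
    ratio = mean / pmean
    discard = ratio < 1.4 or std > 0.55 or (lr_bias > 1.45 and ratio < 1.48)
    return not discard
```
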

[0603] Next, method 1938 in Figure 100 comprises template searching using circles in step
1954. Step 1954 is used to qualify regions as smoke tube candidates. First, construct a binary
mask, validCervix, of valid cervix pixel locations by computing or retrieving the blood mask,
bloodMsk (Bloodvid, described below), computing or retrieving the glare mask, glareMsk
(Glarevid, described above), and then computing validCervix = ROImsk AND
not(BWProng) AND not(dil(glareMsk, 3)) AND not(bloodMsk). Then, determine an
x-coordinate value for the center, xCent, of the circle and a radius, rad. For 2 prongs, xCent is the
midpoint between the centroids of the 2 prongs and rad is half the distance between the prongs + 5.
For 1 prong, choose a default rad of 85 and do a left-right search to decide whether the smoke
tube is on the left or right. The x-coordinate values, xCent, for the 2 search circles are the
x-coordinate of the prong centroid +/- rad. The y-coordinate, yCent, is the y-coordinate of the
prong centroid. For each circle center (xCent, yCent), find all points within rad that are in
validCervix and compute the regional mean from the redness image. Then, find all points
outside rad that are in validCervix and compute the regional mean from the redness image.
Compute the contrast as the ratio of inner mean redness to outer mean redness and select the
circle with minimum contrast. Discard the previous circle if xCent is within rad / 4 from the
left or right edge of the image, since it cannot be a smoke tube. Then, use the search image,
srchImg, to perform an up-down search on the y-coordinate, yCent, to determine the actual
smoke tube location using the x-coordinate xCent computed above. Repeat
the search with the redness image Rn if the results are unsatisfactory. A minimum and
maximum value for yCent, yCentMin and yCentMax, are chosen as follows:

1. yCentMin = -rad + yProngBot, where yProngBot is the mean of
the bottom-most points of the prong(s), or the bottom-most point for a
single prong.

2. For two prongs, yCentMax = yProngBot - (0.75 * rad), i.e. the circle
cannot extend beyond 1/4 rad below the bottom of the prongs.

3. For one prong, yCentMax = min(yProngBot + rad / 3, 150), i.e. the
circle can go well past the end of the prong, but not below the 150th
row of the image.

Three more points spaced (yCentMax - yCentMin)/4 apart are computed between yCentMax
and yCentMin. The search algorithm uses a total of five yCent candidate points. For each yCent
candidate, the inner/outer contrast for circles centered at (xCent, yCent) is computed using
srchImg as follows:

1. Find all points within rad that are in validCervix and
compute the regional mean from srchImg.

2. Find all points outside rad that are in validCervix and
compute the regional mean from srchImg.

3. Compute the contrast as the ratio of the inner mean value of
srchImg to the outer mean value of srchImg and select
the circle with minimum contrast.

Check to see that at least one of the 5 contrast numbers is less than 1. If not, break out of the
loop and proceed no further with this search. If at least one circle has contrast less than 1,
choose the minimum and select a new set of five points centered around this one using the
following steps:

1. If the top or bottom point was the minimum, choose that
point, the one below/above it, and three points evenly
spaced in between them.

2. If one of the three central points was the minimum, choose
that point with the ones immediately below and above it,
and two additional ones centered in the two spaces that
divide those three.

Using the new set of five points, go back to the computation of the inner/outer contrast for
circles using srchImg, discussed herein above, and proceed in this way until the distance
between the five points is less than 3 pixels. When the distance between the points is less than 2
pixels, exit the loop and choose the yCent with the current minimum contrast number as the
final value of yCent for the circle. The contrast for the final circle must be less than 0.92 in
order for the algorithm to find a valid circle. If that is not the case, then the search algorithm is
repeated with the pure redness image, Rn, instead of srchImg, which was a mixture of R and
Rn. If the Rn search produces an acceptable result with contrast less than 0.92, then this value
of yCent is used and we can proceed. Otherwise, there is no suitable circle and the
segmentation mask will contain prongs but no circle.
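
The coarse-to-fine search over yCent may be sketched as follows. The contrast function is
supplied by the caller and stands in for the inner/outer regional mean ratio; the function
name refine_ycent and the exit spacing of 2 pixels are assumptions of this example.

```python
def refine_ycent(contrast, y_min, y_max, min_spacing=2.0):
    """Five-point search for the y-coordinate with minimum contrast.

    Returns (y, contrast) or None if no candidate has contrast below 1.
    """
    points = [y_min + k * (y_max - y_min) / 4 for k in range(5)]
    while True:
        values = [contrast(y) for y in points]
        if min(values) >= 1:
            return None  # no circle is darker inside than outside
        best = values.index(min(values))
        if points[1] - points[0] < min_spacing:
            return points[best], values[best]
        # New set of five points spanning the neighbors of the current minimum.
        lo = points[max(best - 1, 0)]
        hi = points[min(best + 1, 4)]
        points = [lo + k * (hi - lo) / 4 for k in range(5)]
```

A caller would accept the returned circle only if its contrast is below 0.92 and would
otherwise repeat the search with a contrast function based on the redness image.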

[0604] Finally, method 1938 in Figure 100 comprises producing the final smoke tube
segmentation mask in step 1958. First, set the values of all pixels above the horizontal line
inside the circle which is bisected by the center to 1. This effectively casts a "shadow" straight
upward from the bottom of the image, and creates the effect that the smoke tube is coming
straight down from outside of the image. The shadowed circle and prong images are combined
to yield the final segmentation mask. Clean up any stray non-prongs by performing a flood-fill
of "on" valued regions with seeds in the first or thirtieth row of the image to select only objects
that touch the first or thirtieth row of the image.

Osvid

[0605] Step 1460 in Figure 74 depicts the determination of an os image mask, Osvid, for an
image of a tissue sample. The optical properties of the os region may differ from optical
properties of the surrounding tissue. In the method 1438 of Figure 74, the os image mask is used
in soft masking to penalize data from interrogation points that intersect or lie entirely within the
os region. Figure 101A depicts an exemplary image 1964 of cervical tissue used to determine a
corresponding os image mask, Osvid, 1968, shown in Figure 101B.

[0606] The Osvid image mask is determined using a combination of thresholds from different
color channels and using a binary component analysis scheme. An initial mask is formulated
from a logical combination of masks computed from each color channel, R, G, B, and
luminance, Y (equation 94). The four individual masks are computed using a thresholding
method in which the threshold is set relative to the statistics of the colorplane values on the
image region-of-interest (ROI). A component analysis scheme uses the initial mask to detect an
os candidate area (object), which is validated.

[0607] Figure 102 is a block diagram 1988 depicting steps in a method of determining an os
mask, Osvid, for an image of cervical tissue. Image 1990 is an exemplary input image for which
a corresponding os mask 2004 is computed. The following describes the steps of the method
1988 shown in Figure 102, according to one embodiment.

[0608] The method 1988 in Figure 102 includes image preprocessing in step 1992.
Preprocessing includes computing luminance Y from RGB components, Y = 0.299 * R + 0.587 *
G + 0.114 * B; smoothing RGB channels using 2 iterations of a 3x3 boxcar filter; and
computing a ROI mask, ROImsk ([ROI]vid), using the method described herein above. Next,
process the ROI mask by eroding ROImsk 14 times to obtain eROImsk = erod(ROImsk, 14).
Compute the annulus perimeter, annMsk: annMsk = dil((eROImsk AND not erod(eROImsk,
1)), 4). This is a thick closed binary image which traces the edge of the ROI, useful in closing
the boundary around any os which might extend to the background. Remove glare in ROImsk
by logically AND-ing ROImsk with the complement of the glare mask (obtained as described
above) to obtain centerROImsk. Then, compute a mean and standard deviation of each color
channel (meanR, stdR, meanG, stdG, meanB, stdB, meanY, stdY) in the region
specified by the centerROImsk.

[0609] Next, the method 1988 in Figure 102 includes thresholding to produce an initial
segmentation mask in step 1994. First, cut off centerROImsk around the annulus:
centerROImsk = centerROImsk AND not (annMsk). Next, form a binary mask for each of
the RGBY channels that represents pixels that exist in centerROImsk and that satisfy the
following conditions:

1. mskR = (R pixels such that R < (meanR - 0.40 * stdR));

2. mskG = (G pixels such that G < (meanG - 0.65 * stdG));

3. mskB = (B pixels such that B < (meanB - 0.75 * stdB));

4. mskY = (Y pixels such that Y < (meanY - 0.75 * stdY)).

The resulting "initial" segmentation mask, msk, is then defined by:

msk = centerROImsk AND mskR AND mskG AND mskB AND mskY.
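
The per-pixel logic of the initial segmentation mask may be sketched as follows. The
multipliers 0.40, 0.65, 0.75, and 0.75 are assumed readings of the printed values, and the
function and dictionary names are hypothetical.

```python
# Assumed standard-deviation multipliers for the R, G, B, and Y channels.
FACTORS = {"R": 0.40, "G": 0.65, "B": 0.75, "Y": 0.75}


def luminance(r, g, b):
    """Luminance Y computed from the RGB components."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def in_initial_os_mask(pixel, stats, in_center_roi):
    """Test one pixel against all four channel thresholds.

    pixel maps channel name to value; stats maps channel name to (mean, std).
    """
    return in_center_roi and all(
        pixel[ch] < stats[ch][0] - FACTORS[ch] * stats[ch][1] for ch in FACTORS)
```

A pixel enters the initial mask only if it is darker than the channel-specific threshold in
every one of the four channels and lies within centerROImsk.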

[0610] Next, the method 1988 in Figure 102 includes performing a binary component analysis
in step 1996. This step breaks up the segmentation mask into multiple objects. First, perform
binary component labeling on segmentation msk. Remove all objects with size less than 125.
Break apart all objects with size greater than 10000. For each object greater than 10000
(thisObjMsk), do the following:

1. Compute mean values meanR and meanY for the area selected by thisObjMsk
in the red and luminance channels.

2. Set a new threshold for red and Y as follows:

   a. redT = 0.90 * meanR

   b. lumT = meanY

3. Break the object apart, or make it smaller, to yield newObj, then complement
thisObjMsk with the region that is not part of the newly broken-up region:

   newObj = thisObjMsk AND (R pixels such that R >= redT) AND (Y pixels
   such that Y >= lumT).

   thisObjMsk = thisObjMsk AND not(newObj).

4. Keep track of the original large image mask (thisObjMsk) that produces the
smaller objects in step 3. Create a large object mask lgObjMsk for each
thisObjMsk that is set to on for each large object which was found.

[0611] Next, the method 1988 in Figure 102 includes performing dilation, binary component
analysis, and candidate selection in step 1998. Step 1998 is performed to find os candidates
from the multiple binary objects produced in step 1996. First, dilate segMsk produced in step
1996 twice to obtain bMsk = dil(segMsk, 2). Perform a component labeling on bMsk.
Discard objects of size less than 125 or greater than 23,000. For each remaining object,
thisObjMsk, apply the following procedure to select candidates:

1. Compute mean, intMeanR, intMeanY, and standard deviation, intStdR,
intStdY, for red and luminance channel pixel values in thisObjMsk.

2. Dilate thisObjMsk 7 times to yield dThisObjMsk = dil(thisObjMsk, 7).

3. Compute perimeter mask:

   a. thisObjPerim = dil((thisObjMsk AND not(erod(dThisObjMsk, 1))), 3).

4. Compute mean, perMeanR, perMeanY, and standard deviation,
perStdR, perStdY, for red and luminance channel pixel values in
thisObjPerim.

5. Compute the following indicators:

   a. os brightness (osBright) = intMeanY / perMeanY.

   b. Perimeter uniformity (perUnif) = perStdR / intStdR.

6. An object is an os candidate if:

   ((osBright < 0.85) AND (perUnif < 1.75)) OR

   ((osBright < 0.7) AND (perUnif < 2.85) AND (part of object came
   from large object as recorded in lgObjMsk)).
[0612] Next, the method 1988 in Figure 102 includes performing candidate filtering and final 
selection in step 2000. The remaining os candidates are processed as follows. First, discard 
10 large non-os objects at the periphery of the cervix using the following algorithm: 

1 . Define a binary image with a centered circular area of radius 1 50. 

2. Discard the object if more than half of it is outside the circle and if 
perUnif > 0.9. This step is done by performing a logical AND of the 
object with the circular mask, counting pixels and comparing to the 

15 original size of object. 

If the number of remaining objects is greater than 1, perform the following loop for each object: 

1 . Compute the centroid of the object, and compute the distance to the image 
center 

2. Exit if either: 

20 a. The distance to the center is less than 100 for all objects. 

b. No object lies within 100 pixels of center and a single object 
remains. 

Discard the object with the highest perUnif, and go back to step b. Finally, step 2002 of the 
method 1988 in Figure 102 determines the final os mask by twice eroding the final mask 
25 obtained in step 2000. 

Bloodvid 

[0613] Step 1458 in Figure 74 depicts the determination of a blood image mask, Bloodvid, for
an image of a tissue sample. The presence of blood may adversely affect the optical properties
of the underlying tissue. In the method of Figure 74, the blood image mask is used in soft
masking to penalize data from interrogation points that intersect or lie entirely within the blood
regions. Figure 103A depicts an exemplary image 2008 of cervical tissue used to determine the
corresponding blood image mask, Bloodvid, 2012, shown in Figure 103B.
[0614] In one embodiment, the Bloodvid image mask is similar to the Osvid image mask in that
it is determined using an initial mask formulated from a logical combination of masks computed 
from each color channel R, G, B and luminance, Y. However, the initial Bloodvid image mask is 
formed as a logical "OR" (not "AND") combination of the four different masks, each designed 
to capture blood with different color characteristics. Blood may be almost entirely red, in which 
case the Red channel is nearly saturated and the green and blue channels are nearly zero. In 
other cases, blood is almost completely black and devoid of color. In still other cases, there is a 
mix of color where the red channel dominates over green and blue. In one embodiment, the
Bloodvid mask identifies relatively large regions of blood, not scattered isolated pixels that
may be blood. The logical OR allows combination of regions of different color characteristics
into larger, more significant areas that represent blood. As with the Osvid mask, the Bloodvid
mask is formulated by thresholding the initial mask and by performing component analysis.
[0615] Figure 104 is a block diagram 2032 depicting steps in a method of determining a blood
image mask, Bloodvid, for an image of cervical tissue. The following describes the steps of the
method 2032 shown in Figure 104, according to one embodiment.
[0616] The method 2032 in Figure 104 includes image preprocessing in step 2034.
Preprocessing includes computing luminance Y from the RGB components, Y = 0.299 * R + 0.587 *
G + 0.114 * B, and computing the ROI mask, ROImsk, ([ROI]vid) using the method described
hereinabove.
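For illustration, the luminance computation of step 2034 may be sketched per pixel as follows. The function name is an assumption of this sketch, and the channel values are assumed to be on a common scale such as 0 to 255.

```python
def luminance(r, g, b):
    """Weighted sum of the red, green and blue values of one pixel."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```

Because the three weights sum to one, a neutral gray pixel keeps its value, while a pure red pixel maps to roughly 30% of full scale.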

[0617] Next, the method 2032 in Figure 104 includes mask formation via thresholding in step 
2036. The following steps are used to produce an initial segmentation mask. First, four 
preliminary masks are generated to detect "likely" regions of blood, as follows: 

1. To catch blood which is almost completely red, mskA:

mskA = ROImsk AND (B pixels such that B < 15) AND (G pixels such that
G < 15) AND (R pixels such that R > 2 * max(G, B)).

2. To catch areas where red dominates over green and blue, mskB:

mskB = ROImsk AND (R pixels such that R > G * 3) AND (R pixels
such that R > B * 3).

3. To catch really dark, almost black blood, mskC:

mskC = ROImsk AND (R, G, B pixels such that R + G + B < 60).

4. To catch dark, but not completely black blood, mskD:

mskD = ROImsk AND (R, G, B pixels such that R + G + B < 150)
AND (R pixels such that R < 100) AND (R pixels such that R > max(G, B) *
1.6).

The final candidate segmentation mask, mskOrig, is computed as follows: mskOrig = 
mskA OR mskB OR mskC OR mskD. 
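The four preliminary masks and their OR combination may be sketched per pixel as follows. This is a minimal illustration, not the implementation of the embodiment: the function names are hypothetical, and the ROI membership of the pixel is passed in as a boolean.

```python
def blood_premasks(r, g, b, in_roi=True):
    """Return the (mskA, mskB, mskC, mskD) membership of one RGB pixel."""
    msk_a = in_roi and b < 15 and g < 15 and r > 2 * max(g, b)   # almost pure red
    msk_b = in_roi and r > 3 * g and r > 3 * b                   # red dominates
    msk_c = in_roi and (r + g + b) < 60                          # almost black
    msk_d = (in_roi and (r + g + b) < 150 and r < 100
             and r > 1.6 * max(g, b))                            # dark, reddish
    return msk_a, msk_b, msk_c, msk_d


def in_msk_orig(r, g, b, in_roi=True):
    """mskOrig is the logical OR of the four preliminary masks."""
    return any(blood_premasks(r, g, b, in_roi))
```

A saturated red pixel such as (200, 5, 5) falls in both mskA and mskB, whereas a neutral bright pixel such as (200, 200, 200) falls in none of the masks.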

[0618] Next, the method 2032 in Figure 104 includes object selection using double 
thresholding in step 2040. The following steps are used to select regions that are blood candidate 
regions. First, a seed mask, seedMsk, is made by eroding mskOrig twice. Then, to connect 
neighboring pixels, dilate mskOrig once, then erode the result once to obtain clMskOrig. 
Finally, to eliminate spurious pixels and regions that are not connected to larger features, 
compute mask, msk, by performing a flood fill of "on" valued regions of clMskOrig with seeds 
in seedMsk. 
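The seed-based selection of step 2040 may be sketched as follows. This is a sketch under stated assumptions: masks are represented as sets of (x, y) coordinates of "on" pixels on an unbounded grid, a 3x3 square structuring element with 8-connectivity is assumed (the text does not specify either), and all function names are hypothetical.

```python
from collections import deque

# Offsets of a 3x3 square structuring element, including the center.
OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]


def dilate(mask, times=1):
    """Binary dilation of a set of 'on' pixels."""
    for _ in range(times):
        mask = {(x + dx, y + dy) for (x, y) in mask for dx, dy in OFFSETS}
    return mask


def erode(mask, times=1):
    """Binary erosion: keep pixels whose whole 3x3 neighborhood is 'on'."""
    for _ in range(times):
        mask = {(x, y) for (x, y) in mask
                if all((x + dx, y + dy) in mask for dx, dy in OFFSETS)}
    return mask


def flood_fill_from_seeds(mask, seeds):
    """Keep only the connected regions of mask that contain a seed pixel."""
    queue = deque(p for p in seeds if p in mask)
    kept = set(queue)
    while queue:
        x, y = queue.popleft()
        for dx, dy in OFFSETS:
            q = (x + dx, y + dy)
            if q in mask and q not in kept:
                kept.add(q)
                queue.append(q)
    return kept


def select_blood_regions(msk_orig):
    seed_msk = erode(msk_orig, 2)                 # seeds survive two erosions
    cl_msk_orig = erode(dilate(msk_orig, 1), 1)   # closing connects neighbors
    return flood_fill_from_seeds(cl_msk_orig, seed_msk)
```

With this arrangement an isolated pixel survives the closing but has no seed, so the flood fill never reaches it, while a 7x7 block keeps a 3x3 seed and is retained in full.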

[0619] Next, the method 2032 in Figure 104 includes binary component analysis and object 
filtering in step 2042. Binary component labeling is performed on msk to select blood regions. 
For each labeled object the following steps are performed: 

1. The object mask is set to 0. Upon validation, the object mask is turned ON.

2. An interior object is found by shrinking it once (1 erosion step) unless it 
disappears, in which case the algorithm reverts to the original object prior to 
erosion. 

3. Dilate the object OBJ 5 times, compute its perimeter and dilate the perimeter
5 times:

ObjPer = dil((OBJ AND not(erod(dil(OBJ, 5), 1))), 3).

4. For both the interior and perimeter objects, the mean and standard deviation are
found for the Red, Green, and Blue color-planes within the objects. The
interior and perimeter mean luminance is found as the average of the Red,
Green and Blue means.

5. Two indicators are calculated which will help in the decision step:

a. DarkBloodIndicator = (Perimeter Red Mean) / (Interior Red Mean).
This number is high for dark or black blood because there is more red in
the perimeter than in the interior.

b. BrightBloodIndicator =

((Perimeter Green Mean + Perimeter Blue Mean) / Perimeter Red Mean) /
((Interior Green Mean + Interior Blue Mean) / Interior Red Mean).
This number is large when the interior region has a much higher red
content than green and blue as compared to the perimeter.

6. If the following three conditions are met, the region is considered to be a 
"noisy" feature which is most likely near the edge of the cervix. This 
determination affects the decision rules to follow: 

a. Interior mean Red < 40 

b. (Interior standard deviation of Red > Interior mean Red) OR 
(Interior standard deviation of Green > Interior mean Green) OR 
(Interior standard deviation of Blue > Interior mean Blue) 

c. DarkBloodIndicator < 5.

7. The decision rules: If any of the following three rules is satisfied, then this
object is Blood. Otherwise it is not.

a. DarkBloodIndicator > 2.5 AND not "noisy";

b. BrightBloodIndicator > 2.25 AND not "noisy";

c. BrightBloodIndicator > 2.25 AND DarkBloodIndicator > 2.5 (in
this case it does not matter whether the object is "noisy").

8. If the object is blood, it is turned ON in the final segmentation mask. 
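The indicators of step 5, the "noisy" test of step 6 and the decision rules of step 7 may be sketched as follows. The function names are assumptions of this sketch; the interior and perimeter statistics are supplied as (R, G, B) tuples and are assumed to be nonzero where they appear as divisors.

```python
def blood_indicators(interior_mean, perimeter_mean):
    """Return (DarkBloodIndicator, BrightBloodIndicator)."""
    ir, ig, ib = interior_mean
    pr, pg, pb = perimeter_mean
    dark = pr / ir
    bright = ((pg + pb) / pr) / ((ig + ib) / ir)
    return dark, bright


def is_noisy(interior_mean, interior_std, dark):
    """All three conditions of step 6 must hold."""
    ir, ig, ib = interior_mean
    sr, sg, sb = interior_std
    return ir < 40 and (sr > ir or sg > ig or sb > ib) and dark < 5


def is_blood(dark, bright, noisy):
    """Any one of the three rules of step 7 is sufficient."""
    return ((dark > 2.5 and not noisy)
            or (bright > 2.25 and not noisy)
            or (bright > 2.25 and dark > 2.5))
```

For example, an interior mean of (100, 10, 10) against a perimeter mean of (120, 60, 60) gives a BrightBloodIndicator of 5, which satisfies rule b when the object is not "noisy".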

[0620] Finally, the method 2032 in Figure 104 includes determining the final blood mask in
step 2044. Step 2044 includes performing a flood-fill of all objects in which the seed objects
were found to be blood. This yields the final blood segmentation.

Mucusvid

[0621] Step 1464 in Figure 74 depicts the determination of a mucus image mask, Mucusvid, for
an image of a tissue sample. The presence of mucus may affect the optical properties of the
underlying tissue, possibly causing the tissue-class/state-of-health characterization in those
regions to be erroneous. In the method 1438 of Figure 74, the mucus mask is used in soft
masking to penalize data from interrogation points that intersect or lie entirely within the mucus
regions. Figure 105A depicts an exemplary image 2064 of cervical tissue used to determine a
corresponding mucus image mask, Mucusvid, 2068 shown in Figure 105B.

[0622] In one embodiment, the Mucusvid image mask is a modified blood image mask, tuned to
search for greenish or bright bluish objects. Figure 106 is a block diagram 2072 depicting steps
in a method of determining a mucus mask, Mucusvid, for an image of cervical tissue. The
following describes steps of the method 2072 shown in Figure 106, according to one
embodiment.

[0623] The method 2072 in Figure 106 includes preprocessing in step 2074. Preprocessing
includes processing each RGB input channel with a 3x3 median filter followed by a 3x3 boxcar 
filter to reduce noise. Then, calculate or retrieve the following masks: 

1. Glare mask (Glarevid): dilate glare mask once to yield glareMsk

2. ROI mask ([ROI]vid): ROImsk

3. Blood mask (Bloodvid): bloodMsk

4. os mask (Osvid): osMsk

Compute a valid cervix pixels mask, validCervix, by AND-ing the ROImsk with the
complement of the other masks as follows: validCervix = ROImsk AND not(glareMsk)
AND not(bloodMsk) AND not(osMsk).

[0624] Next, the method 2072 in Figure 106 includes mask formation via thresholding and 
morphological processing in step 2076. The following steps are used to produce an initial mucus 
segmentation mask. First, calculate the means, meanR, meanG and meanB, for the RGB 
channels on the validCervix region. Compute the difference, RGgap, between the red and
green means: RGgap = meanR - meanG. Create a binary mask, mskOrig, according to the
following rule: mskOrig = ROImsk AND (R, G, B pixels such that ((2*G - R - B) >= (10 -
RGgap/3))). This rule selects regions where green is somewhat higher than either red or blue
relative to the gap. Finally, process the binary mask with an opening morphological operator to
obtain opMsk, as follows:

1. Perform two erosions with a 3-by-3 disk structuring element.

2. Perform one dilation with a 3-by-3 square structuring element.

3. Perform one dilation with a 3-by-3 disk structuring element.
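The thresholding rule for mskOrig may be sketched per pixel as follows. The function names are hypothetical; meanR and meanG are assumed to have been computed over the validCervix region beforehand, and the morphological opening that follows is not shown.

```python
def rg_gap(mean_r, mean_g):
    """Gap between the red and green means over the valid cervix region."""
    return mean_r - mean_g


def in_mucus_msk_orig(r, g, b, gap, in_roi=True):
    """True when green exceeds the average of red and blue by the adaptive margin."""
    return in_roi and (2 * g - r - b) >= (10 - gap / 3)
```

The threshold adapts to the image: when the cervix is strongly red overall (large RGgap), the margin required of green drops, and with a gap of 60 even a neutral pixel passes; with a gap of 0, green must exceed the red and blue average by at least 5 counts.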

[0625] Next, the method 2072 in Figure 106 includes object selection using double 
thresholding in step 2080. The following steps are used to select objects from the initial
segmentation mask by computation of seed points. First, a seed image, seedMsk, is computed
by eroding opMsk 3 times. Then, opMsk is dilated twice then eroded once. Objects in
opMsk are selected using seedMsk. For example, object I is selected at points where opMsk
and seedMsk intersect, then selMsk is defined as the resulting object selection mask.

[0626] Then, the method 2072 in Figure 106 includes binary component analysis and object 
filtering in step 2082. The following steps are applied to all objects selected in step 2080: 

1. Perform binary component labelling on all selected objects in selMsk.

2. Set final segmentation mask to all 0's.

3. Compute area for each object in selMsk and discard any object with an area
less than 1000 pixels; update selMsk by removing discarded objects.

4. Process all remaining objects in selMsk as follows (steps 2084, 2086):

a. Compute mean and standard deviations of the red, green and blue
smoothed images, meanR, meanG, meanB, stdR, stdG, stdB, for
each object.

b. Compute the object perimeter for each object:

i. Binary object, binObj, is dilated 15 times: dilBinObj =
dil(binObj, 15).

ii. Object perimeter is computed and then dilated:

perBinObj = dil((dilBinObj AND not(erod(dilBinObj, 1))), 4).

c. Compute mean and standard deviations on each color channel, pmeanR,
pmeanG, pmeanB, pstdR, pstdG, pstdB, for each region's
perimeter.

d. Compute six decision rule indicators:

i. Mucus Indicator 1: mucInd1 = (meanG/pmeanG) *
(pmeanR/meanR)

ii. Mucus Indicator 2:

mucInd2 = (meanG/pmeanG) * (pmeanR/meanR) *
(meanB/pmeanB)




iii. Green bright indicator: gBrightInd = 3 * meanG - meanR -
meanB

iv. Local variation quotient:

locVarQuo = (stdR + stdG + stdB) / (pstdR + pstdG + pstdB)

v. Target laser indicator:

targLasInd = (meanG * (pmeanR + pmeanB)) / (pmeanG *
(meanR + meanB))

vi. Blue not too bright indicator: bNotBrightInd

if ((meanB > meanR) AND (meanB > meanG))

bNotBrightInd = (meanG - meanR) / (2 * abs(meanB -
meanG))

else

bNotBrightInd = 10.

e. Object is not mucus object if the following holds: 

(mucInd1 < 1.25) OR (mucInd2 < 1.5) OR (gBrightInd < 100) OR
(bNotBrightInd < 1) OR

(targLasInd > 1.5) OR (locVarQuo > 1.75).

f. If the object is selected as a mucus object, it is added to the final mucus mask. 
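The six indicators of step d and the rejection rule of step e may be sketched as follows. The function names and the dictionary layout are assumptions of this sketch; the object and perimeter statistics are passed as (R, G, B) tuples and are assumed to be nonzero where they appear as divisors.

```python
def mucus_indicators(mean, pmean, std, pstd):
    """Compute the six decision indicators for one candidate object."""
    mean_r, mean_g, mean_b = mean
    pmean_r, pmean_g, pmean_b = pmean
    muc_ind1 = (mean_g / pmean_g) * (pmean_r / mean_r)
    muc_ind2 = muc_ind1 * (mean_b / pmean_b)
    if mean_b > mean_r and mean_b > mean_g:
        # Blue is the dominant channel: measure how far green exceeds red.
        b_not_bright = (mean_g - mean_r) / (2 * abs(mean_b - mean_g))
    else:
        b_not_bright = 10
    return {
        "mucInd1": muc_ind1,
        "mucInd2": muc_ind2,
        "gBrightInd": 3 * mean_g - mean_r - mean_b,
        "locVarQuo": sum(std) / sum(pstd),
        "targLasInd": (mean_g * (pmean_r + pmean_b))
                      / (pmean_g * (mean_r + mean_b)),
        "bNotBrightInd": b_not_bright,
    }


def is_mucus(ind):
    """An object is mucus unless any rejection condition of step e holds."""
    rejected = (ind["mucInd1"] < 1.25 or ind["mucInd2"] < 1.5
                or ind["gBrightInd"] < 100 or ind["bNotBrightInd"] < 1
                or ind["targLasInd"] > 1.5 or ind["locVarQuo"] > 1.75)
    return not rejected
```

A greenish object with interior means (100, 150, 120) and perimeter means (130, 110, 100) passes every test, while an object whose interior matches its perimeter has mucInd1 equal to 1 and is rejected.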

[0627] Step 1452 in Figure 74 depicts the determination of a speculum image mask, [SP]vid,
for an image of a tissue sample. [SP]vid is used in hard masking in the tissue characterization
method 1438 of Figure 74. Here, data from the interrogation points that intersect the speculum
are removed from consideration in the tissue-class/state-of-health classification steps. Figure
107A depicts an exemplary image, 2098, of cervical tissue used to determine the corresponding
speculum image mask, [SP]vid, 2100, shown in Figure 107B.

[0628] In one embodiment, the speculum image mask is determined by finding circles near the 
bottom of the image. Projections of a number of different types of speculums resemble circles of 
different radii. In one embodiment, two types of circle searches are used: an outer bottom 
search and an inner bottom search. The outer bottom search finds points near the bottom edge of 
the general region-of-interest and infers circles from these points. If multiple circles result, they
are evaluated to find the one that best models the curvature at the bottom of the region-of-interest.
A circle that models this curvature well enough is used to form the speculum
segmentation mask, [SP]vid.

[0629] If the outer bottom search does not produce a circle that models the ROI curvature well 
enough, then another search is performed to find a circle that models the curvature of a speculum 
within the ROI. This is the inner bottom search, and may be necessary where there is significant 
reflection of light from the speculum. In the inner bottom search, a set of angular projections is 
formed based on a best guess of the center of curvature from the outer circle search. The 
projections are then analyzed to find a significant intensity trough near the end of the projections 
that agrees with the general expected location of a speculum at the bottom of the image. The 
projection analysis provides new points with which to model circles, and the resulting circles are 
evaluated using the image data to detect the presence of a speculum. 
[0630] Figure 108 is a block diagram 2112 depicting steps in a method of determining a
speculum image mask, [SP]vid, for an image of cervical tissue. The following describes the steps
of the method 2112 shown in Figure 108, according to one embodiment.
[0631] The method 2112 in Figure 108 includes image preprocessing in steps 2114 and 2116.
The following steps are used to preprocess the image used in speculum mask computation. First, 
remove glare from the RGB image by performing the following: 

1. Calculate or retrieve glare mask, glareMsk (Glarevid).

2. Dilate glareMsk 4 times to obtain dilGlareMsk. 

3. Filter the RGB values using dilGlareMsk to perform run-length 
boundary interpolation as follows: 

a. Raster scan each row of dilGlareMsk to find all beginnings and 
ends of pixel runs. 

b. For each pixel P(x,y) in a given run specified by beginning point 
P(xb, y) and end point P(xe,y) in the intensity image, replace 
P(x,y) by half the linearly interpolated value at P(x,y) from P(xb,y) 
and P(xe,y). 

c. Raster scan each column of dilGlareMsk to find all beginnings 
and ends of pixel runs. 

d. For each pixel P(x,y) in a given run specified by beginning point 
P(x, yb) and end point P(x,ye) in the intensity image, add to P(x,y) 
half the linearly interpolated value at P(x,y) from P(x,yb) and
P(x,ye). 
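The run-length boundary interpolation of step 3 may be sketched for a single channel as follows. This is a sketch under stated assumptions, not the implementation of the embodiment: the image and mask are lists of rows, the interpolation endpoints are taken to be the unmasked pixels immediately outside each run (the text does not state whether the endpoints lie inside or outside the run), a run touching the image border reuses its single available neighbor, and all function names are hypothetical.

```python
def runs(flags):
    """Yield (start, end) inclusive index pairs for each run of true values."""
    start = None
    for i, on in enumerate(flags):
        if on and start is None:
            start = i
        elif not on and start is not None:
            yield start, i - 1
            start = None
    if start is not None:
        yield start, len(flags) - 1


def interp_run(values, start, end):
    """Linearly interpolate across values[start..end] from the outside neighbors."""
    left = values[start - 1] if start > 0 else None
    right = values[end + 1] if end + 1 < len(values) else None
    if left is None and right is None:
        return [0.0] * (end - start + 1)   # whole line masked: nothing to use
    left = right if left is None else left
    right = left if right is None else right
    span = end - start + 2
    return [left + (right - left) * (i - start + 1) / span
            for i in range(start, end + 1)]


def fill_glare(image, mask):
    """Replace masked pixels by the mean of row and column interpolations."""
    height, width = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    for y in range(height):                      # row pass: write half
        for s, e in runs(mask[y]):
            for x, v in zip(range(s, e + 1), interp_run(image[y], s, e)):
                out[y][x] = 0.5 * v
    for x in range(width):                       # column pass: add half
        col = [image[y][x] for y in range(height)]
        col_mask = [mask[y][x] for y in range(height)]
        for s, e in runs(col_mask):
            for y, v in zip(range(s, e + 1), interp_run(col, s, e)):
                out[y][x] += 0.5 * v
    return out
```

Each masked pixel therefore ends up as the average of a horizontal and a vertical linear interpolation, and unmasked pixels are left unchanged.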

Then, smooth the RGB channels by filtering twice with a 5x5 box car filter. Finally, calculate or 
retrieve the ROI mask, ROImsk ([ROI]vid). Next, the method 2112 in Figure 108 includes outer
bottom circle detection in step 2120. The outer bottom circle detection is designed to find the 
best circular segmentation matching the bottom of ROImsk. Step 2120 includes the following: 

1. Where width specifies the image width, compute the x-locations of 7 columns
(defined by the intervals Ci = i * width/10, where i = 1 to 9). The x-locations
are used to determine y-values. The resultant (x,y) pairs are used to
find different candidate circles.

2. Four candidate circles (narrow, wide, left, and right) are calculated from
the x values using the following matrix:

a. Narrow circle: C3 C5 C7

b. Wide circle: C2 C5 C8

c. Left circle: C2 C4 C6

d. Right circle: C4 C6 C8

3. The y-values are determined by scanning the y-axis, at a given x-position,
starting at the bottom, until an "on" pixel is encountered in ROImsk. The
same process is performed for 5 adjacent pixels to the right and left of the
given x-position. The resulting 11 y-values are averaged to obtain the y-value
used for calculating circles at the given x-position.

4. For each set of x values defined by the rows in the matrix above, the y values 
are computed as described above, and the resulting three pairs of coordinates 
are used to determine a unique circle intersecting these 3 points. 

5. A candidate circle is retained if:

a. Radius R > 250 AND 

b. R < 700 AND 

c. The circle's center lies at a y value less than 240 (half the image height). 
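The circle fit of step 4 and the retention test of step 5 may be sketched as follows. The circumcenter formula used here is the standard closed form for a circle through three points and is not taken from the text; the function names are hypothetical, and image y-coordinates are assumed to increase downward with an image height of 480.

```python
import math


def circle_through(p1, p2, p3):
    """Return (cx, cy, radius) of the circle through three points, or None."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0:
        return None                      # collinear points: no unique circle
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def retain_outer_circle(cx, cy, radius):
    """Radius between 250 and 700, center above the image midline (y < 240)."""
    return 250 < radius < 700 and cy < 240
```

For instance, the points (320, 400), (20, 100) and (620, 100) define a circle of radius 300 centered at (320, 100), which satisfies all three retention conditions.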
[0632] Next, the method 2112 in Figure 108 includes validation of the outer circle in step
2122. The following steps are used to validate the outer circle: 

1. If circles remain after the previous pruning, perform the following evaluation
procedure:

a. Compute candidate circle center, draw perimeter at given radius and
construct 2 offset regions from the drawn perimeter:

i. The top region is centered 10 pixels above the perimeter of the
circle, and is 7 pixels in height. For example, for a given (x0,y0)
point on the perimeter, the vertical region at x0 comprises the
pixels in the range (x0, y0+10) to (x0, y0+10-7).

ii. Similarly, the bottom region is centered 10 pixels below the
perimeter of the circle, and is 7 pixels in height.

b. The average intensity values, meanTop and meanBot, are calculated
for each region on the red image.

c. The BotTopRatio is calculated as the ratio of meanTop to meanBot.

2. The circle with the best fit to the actual speculum should minimize this ratio. 
If there is more than one circle remaining, the circle with minimum 
BotTopRatio is chosen. 

3. If BotTopRatio > 0.55, the circle is rejected, and it is concluded that the 
outer bottom circle detection found no valid circle. 

If BotTopRatio < 0.55, the circle is kept as the initial result for the speculum segmentation. If 
the outer circle detection produces a circle with a strong enough representation of the speculum, 
then this is taken as the result and an inner speculum search is not done. Otherwise the inner 
speculum search is done. If no circle is found using the outer algorithm, perform the inner 
bottom speculum search. If the outer search finds a circle, look at the BotTopRatio to 
determine whether it qualifies: 

1 . If BotTopRatio < 0.275, take the outer circle as the final segmentation 
mask and stop. 
2. If BotTopRatio >= 0.275, try the inner speculum search to see if it
yields a satisfactory result. 
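The two-threshold decision on the outer circle may be sketched as follows. The function name and the returned labels are hypothetical; None is used here to represent the case in which no candidate circle survived pruning, and a ratio of exactly 0.55, which the text leaves unspecified, is treated as kept.

```python
def outer_circle_decision(bot_top_ratio):
    """Map the BotTopRatio of the best outer circle to the next action."""
    if bot_top_ratio is None or bot_top_ratio > 0.55:
        return "no_outer_circle_run_inner_search"
    if bot_top_ratio < 0.275:
        return "accept_outer_circle_and_stop"
    return "keep_outer_circle_and_run_inner_search"
```

Only a strongly contrasted outer circle (ratio below 0.275) ends the search; a circle with a ratio between the two thresholds is kept as an initial result while the inner search is attempted.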

[0633] Next, the method 2112 in Figure 108 includes inner bottom circle detection in step
2126. The inner bottom circle detection algorithm looks for circles within the ROI mask by
calculating angular projections and looking for "valleys" in the projections to determine points 
that can be used to infer circles. The resulting circles are evaluated with a scheme similar to the 
one for outer bottom circle detection. Step 2126 includes the following: 

1. Angular projection center point selection:

a. If an outer circle was detected, use the center point of the outer circle. 

b. Else, use the point (n/2,1), where n is the width of the image. 

2. The inner speculum search is done on the red color-plane R and a redness- 
enhanced red image ERn. The search results from the two images R and 
ERn are evaluated as a group and the best result is taken from the entire set. 
The redness enhanced red image is given by ERn = (2 * R + Rn)/3, where 
Rn is the redness image defined in Equation 95. If no inner speculum is 
found from the redness enhanced red image, then the inner speculum search 
has determined that there is no identifiable inner speculum. The inner 
speculum search algorithm is described in the subsequent steps. 

3. Calculate angular projections as follows: 

a. Five x-values give the center of each projection as it crosses the bottom
row of the image: [C0 C2 C4 C6 C8].

b. From these x-values, the angle thetaCtr, the central angle for the
projection, is computed.

c. For each angle thetaCtr, a projection sweeping out 10 degrees (5 degrees
on each side of thetaCtr) is calculated.

d. For each 10 degree span, 50 equidistant line profiles (10/50 degrees) are 
used to calculate the projection. The profiles extend from the center point 
to the point where the line at each angle crosses the bottom row of the 
image. 

e. The 50 profiles are averaged to yield the projection for each of the angles 
thetaCtr. 
f. Each projection profile is filtered with a 15-sample-long boxcar moving
window averager.

4. Each projection is searched backward to find the first "peak" in the projection,
then searched backward again until the valley beyond that peak is found. This
valley usually occurs near the boundary between the speculum and the cervix.
Not every projection will yield a good valley point V. The criteria for finding
the valley V of a projection P are as follows:

a. P(V) <= mean(P(V + k)) for all k in [1:12] (12 samples after V);

b. P(V) <= mean(P(V + k)) for all k in [-12:-1] (12 samples before V);

c. P(V) <= P(V + k) for all k in [-12:12];

d. P(V) < P(V + k) - 4 for some k in [V:length(P)] (peak-valley is >= 4);

e. For valley V, find the y coordinate value yv and check that yv > 300.

5. After V is located, search backwards to find the point VMin where the first
derivative of the projection is less than K * minSlope, where minSlope is
the minimum slope between the valley V and the maximum of P(n) for n in
[1:V], and K is a constant parameter set to 0.3. VMin becomes the final point
used for inferring circles from this projection.

6. If the number of points to infer circles (calculated from the valleys as
described above) is greater than 3, then as many circles as possible can be
identified from these points and evaluated. The circles are chosen from the
following matrix:
where the elements of the matrix correspond to the five projections computed
above. If a specific projection j fails to yield an acceptable valley point, then
all rows of the CircleIDX matrix which contain j are removed.

7. All remaining rows in CircleIDX are used to select projections for inferring
circles. The circles are calculated by first getting (x, y) coordinates for the 3 
points defined in the steps above, using the center of projection and the radius 
along the projection. A unique circle is fitted through the 3 points, unless 
points are collinear, and circle center (xCent, yCent) and radius rad are 
computed. 

[0634] Next, the method 2112 in Figure 108 includes validation of the inner bottom circle in
step 2128. The following steps are used to validate the inner bottom circle: 

1. For each circle, the circle is discarded if any of the following conditions
applies: 

a. rad < 250 (the circle is too small to be a speculum) 

b. yCent > (image height)/2 (center of circle in lower half of image or beyond). 

2. Each remaining circle is evaluated with the following technique: 

a. A temporary image is defined for identifying three different regions 
specific to the circle. It is an 8-bit image with the following values: 

i. 1 for the "inner" region, which is the region between the circle and 
another circle whose center is 12 pixels below the original one. 

ii. 2 for the "bottom" region, which is a 12 pixel wide circle drawn 
centered at 20 pixels below the original circle. 

iii. 3 for the "top" region, which is a 12 pixel wide circle drawn 
centered at 20 pixels above the original circle. 

iv. 0 for all other points in the image. 

b. Five sets of pixels are calculated on the temporary image. The average 
pixel value is calculated from the search image (Red or Redness enhanced 
Red) for each set of pixels: 

i. Top pixels, used to calculate AvgTop; 

ii. Bottom Pixels, used to calculate AvgBot; 

iii. Inner pixels, used to calculate AvgIn;

iv. Outer pixels (top and bottom), used to calculate AvgOut;

v. Inner-bottom pixels (inner and bottom), used to calculate
AvgInBot.

c. Two ratios are calculated from these sets of pixels:

i. InOutRatio = AvgIn / AvgOut;

ii. BotTopRatio = min([AvgBot / AvgTop, AvgIn / AvgTop,
AvgInBot / AvgTop]).

d. The InOutRatio gives an estimate of how closely the circle conforms to a 
low-intensity cervix-speculum boundary, and the BotTopRatio helps to 
evaluate how well the circle matches an intensity difference. 

e. To be a valid speculum representation, a circle should satisfy the 
following criterion: 

(InOutRatio < 0.70) OR (InOutRatio < 0.92 AND BotTopRatio < 0.83).

If no circles meet this criterion, then the algorithm detects NO inner 
speculum. 

f. The inner circle representing the speculum is the circle from step e that 
has the minimum value of InOutRatio. 

g. If there is a resulting circle that has passed the validation procedure, 
evaluate to verify it is not a false positive by comparing the mean 
luminance on two portions of the ROI, above the speculum and below the 
speculum. 

i. Glare, blood and os are removed from ROI to obtain dROI, where
dROI = ROI AND not(glareMsk) AND not(bloodMsk) AND
not(osMsk).

ii. Compute mean luminance, meanLTop, on dROI region above
circle.

iii. Compute mean luminance, meanLBot, on dROI region below
circle.

iv. If meanLBot > 0.8 * meanLTop and the bottom-most point on
the inner circle is less than % of the image height, then the
candidate is a false positive and is discarded.
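The ratio computation of step c, the validity criterion of step e and the selection of step f may be sketched as follows. The function names and the candidate tuple layout are assumptions of this sketch; the region averages are assumed to have been measured on the search image already, and the false-positive luminance check of step g is not shown.

```python
def inner_ratios(avg_top, avg_bot, avg_in, avg_out, avg_in_bot):
    """Return (InOutRatio, BotTopRatio) from the five region averages."""
    in_out = avg_in / avg_out
    bot_top = min(avg_bot / avg_top, avg_in / avg_top, avg_in_bot / avg_top)
    return in_out, bot_top


def is_valid_inner(in_out, bot_top):
    """Validity criterion of step e."""
    return in_out < 0.70 or (in_out < 0.92 and bot_top < 0.83)


def pick_inner_circle(candidates):
    """candidates: list of (circle, InOutRatio, BotTopRatio) tuples.

    Returns the valid circle with the smallest InOutRatio, or None.
    """
    valid = [c for c in candidates if is_valid_inner(c[1], c[2])]
    return min(valid, key=lambda c: c[1])[0] if valid else None
```

When no candidate satisfies the criterion, the sketch returns None, which corresponds to the case in which no inner speculum is detected.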
[0635] Finally, the method 2112 in Figure 108 includes final determination of the speculum
segmentation mask in step 2128. The final segmentation mask is computed from the results of
the inner and outer speculum searches. If the outer search produces a satisfactory result and no 
inner search is done, the final mask is the one computed by the outer speculum search. If the 
outer search produces a satisfactory result and an inner search is performed which also produces 
a result, the final segmentation mask is the logical OR of the inner and outer masks. If the outer 
search produces no result but the inner search produces a result, the final mask is the mask from
the inner search. If neither search produces a result, the final segmentation is empty, indicating 
that the algorithm has determined that no speculum is present. 
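The combination of the outer and inner search results may be sketched as follows. In this sketch, masks are represented as sets of (x, y) pixel coordinates and a search that yields no result is represented by None; both conventions and the function name are assumptions.

```python
def final_speculum_mask(outer_mask, inner_mask):
    """Combine the results of the outer and inner speculum searches."""
    if outer_mask is not None and inner_mask is not None:
        return outer_mask | inner_mask      # logical OR of both masks
    if outer_mask is not None:
        return outer_mask
    if inner_mask is not None:
        return inner_mask
    return set()                            # empty mask: no speculum present
```

The case in which the outer search succeeds and the inner search is skipped is covered by passing None for the inner mask.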

[VW]vid

[0636] Step 1454 in Figure 74 depicts the determination of a vaginal wall image mask,
[VW]vid, for an image of a tissue sample. [VW]vid is used in hard-masking in the tissue
characterization method 1438 of Figure 74. Figure 109A depicts an exemplary image 2190 of
cervical tissue used to determine the corresponding vaginal wall image mask, [VW]vid, 2194
shown in Figure 109B.

[0637] In one embodiment, the vaginal wall mask detects vaginal walls and cervical edges, 
including fornices and speculum blades. Here, the mask is determined using a filter shaped like
a notch to emphasize the vaginal wall. This is similar to template matching in which the 
template is present along one dimension and the filter is constant along the other dimension. 
This achieves a projection-like averaging. 

[0638] After application of the filter in horizontal and vertical orientations, the resultant 
gradient images are thresholded and skeletonized. A heuristic graph searching method connects
disconnected edges, and the edges are extended to the bounds of the image to form a full mask.
Once the edges are extended, the edge lines are shadowed outward from the center of the image
to form the final vaginal wall segmentation mask, [VW]vid.

[0639] Figure 110 is a block diagram 2218 depicting steps in a method of determining a
vaginal wall image mask, [VW]vid, for an image of cervical tissue. The following describes the
steps of the method 2218 shown in Figure 110, according to one embodiment.
[0640] The method 2218 in Figure 110 includes preprocessing in step 2220. First, calculate or
retrieve the glare, glareMsk, ROI, ROImsk, and os, osMsk, segmentation masks. Calculate
the luminance L from the RGB signal using the formula: L = 0.299 * R + 0.587 * G + 0.114 *
B. Dilate glareMsk 4 times to obtain dilGlareMsk. Then, filter the RGB image using
dilGlareMsk to perform run-length boundary interpolation as follows:

1. Raster scan each row of dilGlareMsk to find all beginnings and ends of
pixel runs.

2. For each pixel P(x,y) in a given run specified by beginning point P(xb, y) 
and end point P(xe,y) in the intensity image, replace P(x,y) by half the 
linearly interpolated value at P(x,y) from P(xb,y) and P(xe,y). 

3. Raster scan each column of dilGlareMsk to find all beginnings and ends
of pixel runs.

4. For each pixel P(x,y) in a given run specified by beginning point P(x, yb) 
and end point P(x,ye) in the intensity image, add to P(x,y) half the linearly 
interpolated value at P(x,y) from P(x,yb) and P(x,ye). 

5. Perform an 11x11 box car filter smoothing on dilGlareMsk regions only.

Finally, smooth the filled RGB channels by filtering once with a 3x3 box car filter.

[0641] Next, the method 2218 in Figure 110 includes gradient image processing in steps 2222
and 2224. First, create a notch filter for detecting the vaginal wall. The filter of length 22 is
defined by the following coefficients: [1 1 1 1 2/3 1/3 0 -1/3 -2/3 -1 -1 -1 -1 -2/3 -1/3 0 1/3
2/3 1 1 1 1]. Then, normalize the filter: the average of the filter coefficients is subtracted from
the filter in order to make it a zero-gain convolution kernel. Replicate rows 24 times to create a
22 by 24 filter. Filter the luminance image L with the vaginal wall notch filter to produce the
vertical gradient image vGradImg. Filter the luminance image with the transpose of the notch
filter to produce the horizontal gradient image hGradImg. Clip gradient images to 0. Finally,
perform the following thresholding and clean-up operations on each of the gradient images
hGradImg and vGradImg:

1. Threshold the images at 975 to yield a binary object image.

2. Perform a binary component labeling using 4-way connectivity.

3. Compute regions statistics: area, centroid, major and minor axis length.

4. Discard any object whose size is less than 1000 pixels.

5. Discard any object which is within 80 pixels of the center of
the image.

6. Dynamically calculate the minimum allowable length,
MinAllowedLength, for each object based upon the distance of its
centroid (xCentroid, yCentroid) from the center of the image (Cx, Cy)
defined by Cx = (image width)/2 and Cy = (image height)/2. Let x be the
distance of the centroid to the center of the image, x = sqrt((xCentroid
- Cx)^2 + (yCentroid - Cy)^2).

MinAllowedLength scales the minimum allowed distance from 250 (at
the image center) to 100 at the left or rightmost edge of the image and is
defined by:

MinAllowedLength = 250 - (15 * x / 25).

7. Discard any object with a major axis length less than 
MinAllowedLength. 

8. Discard any object that is more than 50% outside of the image's ROI. 

9. Discard any object that covers more than 5% of the os. 
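The notch kernel of paragraph [0641] and the length rule of step 6 can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the orientation of the replicated kernel (24 rows of 22 coefficients) is an assumption, since the text only states that the result is a 22 by 24 filter.

```python
import numpy as np

def wall_notch_kernel(rows=24):
    """Zero-gain notch kernel built from the 22 listed coefficients."""
    c = np.array([1, 1, 1, 1, 2/3, 1/3, 0, -1/3, -2/3, -1, -1,
                  -1, -1, -2/3, -1/3, 0, 1/3, 2/3, 1, 1, 1, 1], dtype=float)
    c = c - c.mean()              # subtract the average so the kernel sums to zero
    return np.tile(c, (rows, 1))  # assumption: one coefficient row repeated 24 times

def min_allowed_length(x_centroid, y_centroid, width, height):
    """MinAllowedLength = 250 - (15 * x / 25), x = centroid distance to image center."""
    cx, cy = width / 2, height / 2
    x = np.hypot(x_centroid - cx, y_centroid - cy)
    return 250 - (15 * x / 25)
```

With these definitions an object whose centroid lies at the image center needs a major axis of at least 250 pixels, and one whose centroid is 250 pixels from the center needs at least 100.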

[0642] Next, the method 2218 in Figure 110 includes skeletonization in step 2226. The binary
images resulting from step 2224 are processed with a skeletonization algorithm that
approximates the medial axis transform. The skeletonization algorithm works for either
horizontal or vertical edges. For vertical edges, each row is scanned from left to right. Each
time the pixel values transition from OFF to ON, the index of the ON pixel is remembered. If
the first pixel in the row is ON, this qualifies as a transition. When there is a transition from ON
back to OFF, the index of the last ON pixel is averaged with the index from the previous step to
give the center pixel in the ON region. If an ON region extends to the last pixel in the row, then
this last pixel is treated as a transition point. All pixels between and including the first and last
ON pixels are turned off except the center pixel. For horizontal edges, each column is scanned
from top to bottom. The same steps described hereinabove are repeated for the columns instead
of the rows.
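The row scan described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and when a run has an even number of pixels the averaged index is rounded down, which the text does not specify.

```python
import numpy as np

def skeletonize_rows(binary):
    """Keep only the center pixel of every horizontal ON run (vertical edges)."""
    binary = np.asarray(binary, dtype=bool)
    out = np.zeros_like(binary)
    for r, row in enumerate(binary):
        start = None
        for c, on in enumerate(row):
            if on and start is None:
                start = c                      # OFF -> ON transition (or ON at column 0)
            last_col = c == len(row) - 1
            if start is not None and (not on or last_col):
                end = c if on else c - 1       # last ON pixel of the run
                out[r, (start + end) // 2] = True  # assumption: round the average down
                start = None
    return out

def skeletonize_cols(binary):
    """Same scan applied to columns (horizontal edges)."""
    return skeletonize_rows(np.asarray(binary).T).T
```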

[0643] Next, the method 2218 in Figure 110 includes edge linking and extension in steps 2226
and 2228. The skeletonizations are processed with a heuristic graph-searching method which
connects slight gaps in the skeletonized images and extends the edges to the image boundary.
The following images and parameters are used by the edge linking algorithm:

• Horizontal and vertical skeletonized edge image, vSkelImg, hSkelImg




• Input label matrix, LblMat. This is the label matrix output from the
connected components analysis, where discarded regions have been removed
from the label matrix by setting their pixel values back to 0.

• Horizontal and vertical edge orientation, vEdgeOrient, hEdgeOrient

• Skeletonized input label matrix, skLblMat. This is a copy of LblMat where
all the pixels which are OFF in the skeletonized image are set to 0 in
skLblMat.

• Gap = 16.0, the maximum allowable gap to fill in for a disconnected edge.

The following searching methods are implemented.

1. Search for Edge Pixels: For both the horizontal and vertical edge images,
the images are raster searched to locate edges within them.

a. The vertical edge image, vSkelImg, is searched by row raster scanning to
ensure that the first point in an edge is encountered.

b. The horizontal edge image, hSkelImg, is searched by column raster
scanning to ensure that the first point in an edge is encountered.

c. When a point is encountered, the algorithm references skLblMat to see if
that point has a positive label, indicating that this edge has not yet been
processed. If so, the edge connection and edge extension routines
described in the steps below are executed starting from this point.

2. Edge Connection. The edge connection routine starts from the point from 
which it is called. The routine keeps a list of the points encountered in the 
edge. The search is executed only for points with the same label in 
dilGlareMsk. 

a. Create Label matrix skLblMat as described above. 

b. Find second point: 

i. Starting from the first point, do a search in a rectangular region of 
size 2*( Gap +1.5) + 1 centered about the first point. 

ii. The second point will be the point which is ON in the edge image 
which is closest to the first point, and which is not already part of 
any other linked edge (must have same label value as the first 
point). 
iii. Fill in the gap between the first point and the second point. The
Gap filling algorithm is described below in step 3.

iv. If this edge begins at a point "sufficiently close" (with respect to 
Gap) to another edge, set a flag to prevent extension of the 
beginning of this edge. 

v. If no second point is found, or if the second point is part of another 
edge which has already been linked, erase this edge in the output 
edge image (see Edge Erasing description below) and in 
skLblMat, stop processing this edge, and continue the loop to 
look for the next edge. 

c. Find the third point: 

i. Starting from the second point, do a search in a rectangular region 
of size 2*( Gap +1.5) + 1 centered about the second point.

ii. The third point will be the point which is ON in the edge image 
which is closest to the second point, and which is not already part 
of this or any other linked edge (must have same label value as the 
first point). 

iii. Fill in the gap between the second point and the third point. 

iv. If no third point is found, or if the third point is part of another 
edge which has already been linked, erase this edge in the output 
edge image, stop processing this edge, and continue the loop to 
look for the next edge. 

d. After three points in this edge are discovered, there is enough information 
to infer a search direction, and from here on out all searches in the Edge 
Connection are directional. Steps for computing the search location are 
listed below. 

e. Starting with the search for the fourth point, the following steps are 
iteratively performed until no further pixels in this edge can be found: 

i. The search direction: North (N), South (S), East (E), West (W),
NorthEast (NE), NorthWest (NW), SouthEast (SE) or SouthWest
(SW) is computed by the steps described below.

ii. Check the edge length; if it is greater than 2048, break out of the
loop because this edge must have looped back upon itself.

iii. Find the next point in the given search direction. If no further
points are found, check to see if the edge length is less than 120.

1. If edge length < 120, erase the edge and break out of this loop
to continue the processing to find other edges (back to step 1).

2. If edge length >= 120, keep the edge, break out of the loop, and
continue with step f.

iv. Fill in the gap between the current point and the new point. 

v. If the new point belongs to an edge which was already linked by 
this algorithm, do the following: 

1 . If the current edge is less than 40 pixels in length, erase this 
edge. Break out of the loop and continue searching for 
further edges (back to step 1). 

2. Otherwise, the edge will be kept, but a flag is set so that the 
end of this edge is not extended. Break out of the loop and 
continue with step f.

vi. Increment the edge length so that the new point becomes the 
current point for the next iteration. 

vii. Continue with step i) to continue processing. 

f. At this point, a valid edge has been detected. This edge will then be
extended in both directions to the boundary of the image unless either
end (or both) is flagged for not extending. The edge extension steps are
described below in step 5.

g. Check to see if an extension passed through the center of the image
(defined by a circle of radius 80 centered at the geometrical center of the
image).

i. If an extension did pass through the center of the image, erase this
edge and all of its extensions.

ii. Otherwise, relabel this edge in the Label matrix to have value -1,
and draw the extensions on the output edge image, simultaneously
labeling the corresponding pixels in the Label matrix with value
-2.

3. Gap Filling method:

a. Check to see if there is no gap, i.e. if the edge is already connected.
Where (x1,y1) and (x2,y2) are the new point and the current point, if
abs(x1-x2)<2 and abs(y1-y2)<2, then there is no gap to fill, and the Gap
Filling processing stops.

b. Remove the "New pixel" from the edge vectors so that it can be replaced
with a set of filled-in pixels.

c. Check for special cases where x1=x2 or y1=y2. In either of those two
cases, the Gap Filling is accomplished by simply turning on every pixel
which lies between the two pixels in the output Edge image.

d. For the case where x1 is not equal to x2 and y1 is not equal to y2, a diagonal
line needs to be drawn to fill the gap.

i. This is done first by computing an equation for the line which
connects the two points.

ii. If the slope is greater than 1, iterate from y=y1 to y2, and compute
the x value for each y value. For each (x,y) turn on the
corresponding pixel in the output Edge image and in skLabMat.

iii. If the slope is less than 1, iterate from x=x1 to x2, and compute the
y value for each x value. For each (x,y) turn on the corresponding
pixel in the output Edge image and in skLabMat.

e. Finally, all of the new pixels are added to the edge vectors in order from
the current pixel to the new one. The corresponding pixels in skLabMat
are set to the label value -2.

4. Computing Search Direction: 

a. Two pixel locations are used to infer a search direction. 

i. The first point is the geometric average of the two most current
pixels in the edge.

ii. If there are fewer than 6 pixels in the edge, the second point is the
average of the first and second pixels in the edge.

iii. If there are more than 6 pixels in the edge, the second point is the
average of the fifth and sixth most current pixels in the edge.

b. For the two points (x1,y1) and (x0,y0), the search direction is computed as
follows:

i. Compute the angle formed by the two points using the ATAN2
function:

angle = atan2(y1-y0, x1-x0) * 180 / π;

ii. If angle is in the interval [-22.5, 22.5], the search direction is E.

iii. If angle is in the interval [22.5, 67.5], the search direction is SE.

iv. If angle is in the interval [67.5, 112.5], the search direction is S.

v. If angle is in the interval [112.5, 157.5], the search direction is SW.

vi. If angle is in the interval [-67.5, -22.5], the search direction is NE.

vii. If angle is in the interval [-112.5, -67.5], the search direction is N.

viii. If angle is in the interval [-157.5, -112.5], the search direction is NW.

ix. Otherwise, the search direction is W.

5. Edge Extension:

a. It is the default to extend both the beginning and end of the edge.
However, during the edge connection steps, if it is discovered that the
edge originates close to a different edge, the edge is connected to the
different edge and is not extended. If an edge ends by merging with
another edge, the end of the edge is not extended.

b. For both the beginning and the end of the edge:

i. For vertically oriented edge images (vEdgeOrient):

1. If the y-coordinate for the first/last point of the edge is less
than the image height/6 or greater than 5*height/6, extend
the beginning/end of the edge using the local slope method
(described below).

2. Otherwise, extend the beginning/end of the edge using the
global slope method (described below).

ii. For horizontally oriented edge images (hEdgeOrient):

1. If the x-coordinate for the first/last point of the edge is less
than the image width/6 or greater than 5*width/6, extend
the beginning/end of the edge using the local slope method
(described below).

2. Otherwise, extend the beginning/end of the edge using the
global slope method (described below).

c. Local Slope Extension: This method uses the slope of the edge near its 
beginning/end to determine the slope of the extending line. 

i. Compute two points for slope computation: 

1. the average of the four pixels from the beginning/end of the 
edge; and 

2. the average of the 6th through 9th pixels from the 
beginning/end of the edge. 

ii. Using the two computed points, the edge is extended from its 
beginning/end point using a line of the computed slope until it 
reaches the edge of the image. 

d. Global Slope Extension: this method uses pixel values between 20% and 
80% of the length along the edge to guess the "average" slope of this edge. 
Then the beginning/end of the edge is extended using this slope. 

i. If the edge has edgeLen pixels in it, select the points in the edge 
with the following indices: 

1. begIDX = round(edgeLen * 0.2); pointA =
edge(begIDX);

2. endIDX = round(edgeLen * 0.8); pointB =
edge(endIDX).

ii. Compute the slope using pointA and pointB, and use a line of
this slope to extend from the beginning/end of this edge.

e. After the extension is computed, the extended pixels are turned ON in the 
output edge image, and the corresponding pixels in skLabMat are 
assigned value -2. 
6. Edge Erasing.

When an edge is to be erased, check to verify that for each pixel in the edge
and its extension the label for the pixel is > 0. If so, set the value in the output
Edge image and the label matrix to 0. This method assures that pixels in
another edge that has already been linked are not erased (the two edges might
have crossed).
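Two of the helper routines in the edge linking method, the search direction of step 4 and the gap filling of step 3, can be sketched as follows. This is an illustrative sketch, not the implementation of the method. The function names are hypothetical. The sketch assumes image coordinates with y growing downward, takes the direction from the older averaged point toward the more recent one, and maps the interval [-157.5, -112.5] to NW by compass symmetry. The gap filler steps along the longer axis, which covers the horizontal, vertical, and both diagonal slope cases in one loop; the rounding of intermediate coordinates (Python's `round`) is also an assumption.

```python
import math

def search_direction(p_recent, p_older):
    """Compass direction from p_older toward p_recent; points are (x, y)."""
    angle = math.degrees(math.atan2(p_recent[1] - p_older[1],
                                    p_recent[0] - p_older[0]))
    bins = [(-22.5, 22.5, "E"), (22.5, 67.5, "SE"), (67.5, 112.5, "S"),
            (112.5, 157.5, "SW"), (-67.5, -22.5, "NE"),
            (-112.5, -67.5, "N"), (-157.5, -112.5, "NW")]
    for lo, hi, name in bins:
        if lo <= angle <= hi:
            return name
    return "W"

def fill_gap(p_cur, p_new):
    """Pixels after p_cur up to and including p_new, in order."""
    (x2, y2), (x1, y1) = p_cur, p_new
    if abs(x1 - x2) < 2 and abs(y1 - y2) < 2:
        return [p_new]                      # already connected, nothing to fill
    dx, dy = x1 - x2, y1 - y2
    steps = max(abs(dx), abs(dy))           # iterate along the longer axis
    return [(round(x2 + dx * k / steps), round(y2 + dy * k / steps))
            for k in range(1, steps + 1)]
```

For example, `fill_gap((0, 0), (4, 0))` yields the four pixels from (1, 0) to (4, 0), which would then be turned ON in the output edge image and labeled in the label matrix.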

[0644] Finally, the method 2218 in Figure 110 includes mask computation in step 2230. The 
output of the Edge Linking algorithm is used to generate the vaginal wall mask in the following 
way: 

1. Vertical connected-edge image, VConnImg: a cumulative sum is calculated
for each row, starting from the center and extending both to the left and to the
right.

2. Horizontal connected-edge image, HConnImg: a cumulative sum is
calculated for each column, starting from the center and extending both
upward and downward.

3. The two cumulative sums are thresholded at >=1 and OR-ed together to yield 
the final vaginal wall mask. 
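The mask computation can be sketched with cumulative sums that run outward from the image center, so that every pixel lying beyond a linked edge (as seen from the center) receives a sum of at least 1. This is a sketch under assumptions: the function names are hypothetical, and the center column or row is taken as the integer half of the image size.

```python
import numpy as np

def outward_cumsum_rows(edges):
    """Cumulative sum along each row, starting at the center column and going outward."""
    e = np.asarray(edges, dtype=int)
    c = e.shape[1] // 2
    out = np.empty_like(e)
    out[:, c:] = np.cumsum(e[:, c:], axis=1)                    # center -> right
    out[:, :c] = np.cumsum(e[:, :c][:, ::-1], axis=1)[:, ::-1]  # center -> left
    return out

def vaginal_wall_mask(v_conn, h_conn):
    """Threshold both cumulative sums at >= 1 and OR them together."""
    v = outward_cumsum_rows(v_conn)
    h = outward_cumsum_rows(np.asarray(h_conn).T).T  # column-wise version
    return (v >= 1) | (h >= 1)
```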

[0645] Step 1454 in Figure 74 depicts the determination of a fluid-and-foam mask, [FL]vid, for
an image of a tissue sample. This mask identifies fluid and foam regions appearing on tissue
samples and is used in hard masking in the tissue characterization method 1438 of Figure 74.
Figure 111A depicts an exemplary image 2234 of cervical tissue used to determine the
corresponding fluid-and-foam image mask, [FL]vid, 2238 shown in Figure 111B.

[0646] In one embodiment, the fluid-and-foam image mask identifies regions where excess
fluids and/or foam collect on cervical tissue. Excess fluid or foam can collect near the speculum,
around or in the os, and/or in the folds between the vaginal walls and the cervix, for example.
One embodiment of the fluid-and-foam image mask, [FL]vid, uses a measure of whiteness and a
measure of blue-greenness to identify regions of fluid/foam. After extracting white and
blue-green color features, thresholding and validation are performed to produce the final
fluid-and-foam image mask, [FL]vid.
[0647] Figure 112 is a block diagram 2258 depicting steps in a method of determining a
fluid-and-foam image mask, [FL]vid, for an image of cervical tissue. The following describes
the steps of the method 2258 shown in Figure 112, according to one embodiment.

[0648] The method 2258 in Figure 112 includes preprocessing in step 2260. First, remove
glare from the RGB image. Retrieve or compute glare mask, glareMsk. Dilate glareMsk 4
times to obtain dilGlareMsk. Next, retrieve or compute ROI mask, ROIMsk. Finally, smooth
each of the RGB channels using a 3x3 box car filter to remove noise.

[0649] Next, the method 2258 in Figure 112 includes image color feature calculation in step
2262. This step computes a "whiteness" image, WImg, and a "green-blueness" image, BGImg.
First, calculate the luminance L from the RGB signal using the formula: L = 0.299 * R + 0.587 *
G + 0.114 * B. Next, compute, normalize and threshold WImg as follows:

1. WImg = abs((R - G)/(R + G)) + abs((R - B)/(R + B)) + abs((G - B)/(G + B)).

This operation is a pixel-wise operation and is performed on each
pixel sequentially.

2. Normalize WImg: WImg = 3 - WImg.

3. Set low luminance pixels to 0 (low luminance pixels are unlikely to be in
the fluid and foam regions):

If L < mean(L), WImg = 0.

Finally, compute, normalize and threshold BGImg as follows:

1. BGImg = abs((R + 30 - G)/(R + 30 + G)) + abs((R + 30 - B)/(R + 30 + B))
+ abs((G - B)/(G + B)).

This operation is a pixel-wise operation and is performed on each pixel
sequentially.

2. Normalize BGImg: BGImg = 3 - BGImg.

3. Set low luminance pixels to 0 (low luminance pixels are unlikely to be in
the fluid and foam regions):

If L < 0.65 * mean(L), BGImg = 0.
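The two color feature images can be computed in vectorized form as sketched below. The function names are hypothetical, and the small `eps` added to each denominator is an assumption introduced only to avoid division by zero on black pixels; it is not part of the formulas above.

```python
import numpy as np

def _ratio(a, b, eps=1e-9):
    """abs((a - b) / (a + b)), guarded against a zero denominator (assumption)."""
    return np.abs((a - b) / (a + b + eps))

def color_features(R, G, B):
    """Return luminance L, whiteness image W and green-blueness image BG."""
    R, G, B = (np.asarray(c, dtype=float) for c in (R, G, B))
    L = 0.299 * R + 0.587 * G + 0.114 * B
    W = 3.0 - (_ratio(R, G) + _ratio(R, B) + _ratio(G, B))
    W[L < L.mean()] = 0.0              # low-luminance pixels are not fluid or foam
    Rs = R + 30                        # red channel offset used by the blue-green feature
    BG = 3.0 - (_ratio(Rs, G) + _ratio(Rs, B) + _ratio(G, B))
    BG[L < 0.65 * L.mean()] = 0.0
    return L, W, BG
```

A perfectly gray pixel has whiteness 3, the maximum, while strongly colored pixels score lower.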

[0650] Next, the method 2258 in Figure 112 includes processing and segmenting bright
green-bluish regions in steps 2264, 2266, 2268, 2270, 2272, 2274, and 2276. These steps are
performed as follows:

1. Retrieve or compute glare mask, glareMsk.

2. Fill glare regions of BGImg using glareMsk to perform run-length boundary
interpolation as follows:

a. Raster scan each row of glareMsk to find all beginnings and ends of pixel
runs.

b. For each pixel P(x,y) in a given run specified by beginning point P(xb, y)
and end point P(xe,y) in the intensity image, replace P(x,y) by half the
linearly interpolated value at P(x,y) from P(xb,y) and P(xe,y).

c. Raster scan each column of glareMsk to find all beginnings and ends of 
pixel runs. 

d. For each pixel P(x,y) in a given run specified by beginning point P(x, yb) 
and end point P(x,ye) in the intensity image, add to P(x,y) half the linearly 
interpolated value at P(x,y) from P(x,yb) and P(x,ye). 

3. Eliminate low intensity areas using a threshold of 1.5:

If BGImg < 1.5, BGImg = 1.5.

4. Rescale the BGImg to [0, 1]:

BGImg = (BGImg - min(BGImg))/(3 - min(BGImg)).

5. Compute thresholds from image statistics and perform thresholding.

a. Compute image mean intensity, Imean, for BGImg > 0.

b. Compute image standard deviation of intensity, IstdDev, for BGImg > 0.
Compute threshold thGB, thGB = Imean + 1.63 * IstdDev.

c. Apply threshold limits:
if thGB < 0.80, thGB = 0.80;
if thGB > 0.92, thGB = 0.92.

d. Threshold to get the initial green-bluish fluid and foam mask GBMask:
if BGImg > thGB, then

GBMask = 1;
else

GBMask = 0.
6. Perform morphological processing to fill small holes and smooth boundaries
of the found regions in GBMask:

a. Dilate the segmentation mask GBMask twice, GBMask = dil(GBMask, 2).

b. Erode the resultant mask three times, GBMask = erode(GBMask, 3).

c. Dilate the resultant mask once, GBMask = dil(GBMask, 1).

7. Perform binary region labeling and small region removal: 

a. Perform a connected components labeling, described above, to label all 
found regions. 

b. Compute each region area, area, and eccentricity, ecc. 

c. Remove small and round regions and small line segments that are not 
likely to be the fluid and foam regions: 

If ((area < 1000) AND (ecc < 0.70)) OR ((area < 300) AND (ecc > 
0.70)) OR (area < 1000), remove region. 

8. Green-Bluish feature validation for each found region is based on the original 
RGB values: 

a. For each found region, retrieve the mask, Imsk, and compute the mean
intensities within the region for each of the red, green and blue channels as
MRed, MGreen and MBlue.

b. If the found region is tissue-like, remove the region:

if [(MGreen - MRed)+(MBlue - MRed)] < -5 remove region.

c. If the found region is too blue, remove the region:
if (MBlue > MGreen + 15) remove region.

9. The final green-bluish fluid and foam mask, FGBMask, is calculated by 
performing a flood-fill of "on" valued regions of GBMask from step 5 with 
seeds in the validated regions from step 6 and step 7. 
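Steps 3 through 5 of the list above (flooring, rescaling, and the clamped statistical threshold) can be sketched as a single function. This is a minimal sketch: the function name is hypothetical, the population standard deviation (NumPy's default) is an assumption, and the guard for a degenerate image whose floored minimum equals 3 is added for safety and is not part of the text.

```python
import numpy as np

def green_blue_mask(bg_img, floor=1.5, k=1.63, lo=0.80, hi=0.92):
    """Return (GBMask, thGB) for a glare-filled BGImg with values in [0, 3]."""
    img = np.maximum(np.asarray(bg_img, dtype=float), floor)  # step 3
    mn = img.min()
    denom = 3.0 - mn
    if denom <= 0:                                            # guard (assumption)
        return np.zeros(img.shape, dtype=bool), hi
    img = (img - mn) / denom                                  # step 4: rescale to [0, 1]
    pos = img[img > 0]
    th = pos.mean() + k * pos.std() if pos.size else hi       # steps 5a and 5b
    th = min(max(th, lo), hi)                                 # step 5c: limit to [0.80, 0.92]
    return img > th, th                                       # step 5d
```

The morphological clean-up, region labeling, and color validation of steps 6 through 9 would then operate on the returned mask.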

[0651] Next, the method 2258 in Figure 112 includes processing and segmenting pure white
regions in steps 2278, 2280, 2282, 2284, 2286, 2288, and 2290. These steps are performed as
follows:

1. Retrieve glare mask, glareMsk, and ROI mask, ROIMsk.

2. Fill glare regions of WImg using glareMsk to perform run-length boundary
interpolation as follows:

a. Raster scan each row of glareMsk to find all beginnings and ends of 
pixel runs. 

b. For each pixel P(x,y) in a given run specified by beginning point P(xb, y) 
and end point P(xe,y) in the intensity image, replace P(x,y) by half the 
linearly interpolated value at P(x,y) from P(xb,y) and P(xe,y). 

c. Raster scan each column of glareMsk to find all beginnings and ends of 
pixel runs. 

d. For each pixel P(x,y) in a given run specified by beginning point P(x, yb) 
and end point P(x,ye) in the intensity image, add to P(x,y) half the linearly 
interpolated value at P(x,y) from P(x,yb) and P(x,ye). 

3. Compute WImg mean, mWImg, and standard deviation, stdWImg.

4. Eliminate low intensity areas:

if WImg < mWImg - 0.1 * stdWImg, WImg = mWImg - 0.1 * stdWImg.

5. Rescale the WImg to [0, 1]:

WImg = (WImg - min(WImg))/(3 - min(WImg)).

6. Compute thresholds from image statistics and perform thresholding:

a. Compute image mean intensity, Imean, for WImg > 0.

b. Compute image standard deviation of intensity, IstdDev, for WImg > 0.

c. Compute threshold thW,

thW = Imean + 1.10 * IstdDev.

d. Threshold to get the initial white fluid and foam mask WMask:
if ((WImg > thW) AND (pixel is included in ROIMsk)), then

WMask = 1;
else

WMask = 0.

7. Perform morphological processing to fill small holes and smooth boundaries
of the found regions in WMask:

a. Erode the segmentation mask WMask twice, WMask = erode(WMask, 2).

b. Dilate the resultant mask three times, WMask = dilate(WMask, 3).

8. Perform binary region labeling and small region removal:

a. Perform a connected components labeling, as described, to label all found
regions.

b. Compute each region area, area.

c. Remove small regions that are not likely to be the fluid and foam regions:
If (area < 300) remove the region from the region list.

9. Whiteness feature validation for each found region based on the original RGB 
values: 

a. For each found region, retrieve the mask, iMsk, and compute the mean intensities
within the region for each of the red, green and blue channels as iMRed,
iMGreen and iMBlue.

a. Dilate iMsk five times to obtain iD1Msk = dilate(iMsk, 5).

b. Compute the perimeter pixels iPeriMsk from iD1Msk:
iPeriMsk = not(erode(iD1Msk, 1)) AND (iD1Msk).

c. Dilate iPeriMsk three times to get the outer mask: iD2Msk = dilate(iPeriMsk,
3).

d. Compute mean intensities on iD2Msk for each of the R, G and B channels as
perimeter (Outer) means: pMRed, pMGreen and pMBlue.

e. Compute the Inner region green-blueness:
innerGB = (iMGreen - iMRed) + (iMBlue - iMRed).

f. Compute the Inner region whiteness:

innerW = 3.0 - (abs((iMRed - iMGreen)/(iMRed + iMGreen)) +
abs((iMGreen - iMBlue)/(iMGreen + iMBlue)) +
abs((iMBlue - iMRed)/(iMBlue + iMRed))).

g. Compute the Outer region whiteness:

outerW = 3.0 - (abs((pMRed - pMGreen)/(pMRed + pMGreen)) +
abs((pMGreen - pMBlue)/(pMGreen + pMBlue)) +
abs((pMBlue - pMRed)/(pMBlue + pMRed))).

h. Compute the Outer region redness:

outerRed = (pMRed - pMGreen) + (pMRed - pMBlue).

i. Apply general whiteness validation rule:

if (((innerGB < 10) AND (outerRed > 25)) OR (outerW > (innerW - 0.1))), then:

set isFluid to 0, since it is not likely to be a fluid and foam region;
else,

set isFluid to 1.

j. Very white fluid-foam validation rule:
If (innerW > (outerW + 0.16)) set isFluid to 1.

k. Very high inner green bluish fluid-foam validation rule:
If (innerGB > 10) set isFluid to 1.

10. The final white fluid-foam mask fWMask is calculated by performing a
flood-fill of "on" valued regions of WMask from step 8 with seeds in the
validated regions (isFluid = 1) from step 9.
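The three validation rules of step 9 can be sketched as a small decision function operating on the inner and outer mean colors of one region. The function names are hypothetical, and the sketch assumes the rules are applied in the listed order, so that the two later rules can override a rejection by the general rule.

```python
def whiteness(r, g, b):
    """3 minus the sum of the three normalized channel differences."""
    return 3.0 - (abs((r - g) / (r + g)) + abs((g - b) / (g + b))
                  + abs((b - r) / (b + r)))

def is_fluid(inner_rgb, outer_rgb):
    """Apply the general, very-white and high green-blue validation rules."""
    iR, iG, iB = inner_rgb
    pR, pG, pB = outer_rgb
    inner_gb = (iG - iR) + (iB - iR)
    inner_w = whiteness(iR, iG, iB)
    outer_w = whiteness(pR, pG, pB)
    outer_red = (pR - pG) + (pR - pB)
    # General rule: reject regions on red surroundings or no whiter than their rim.
    fluid = not ((inner_gb < 10 and outer_red > 25) or (outer_w > inner_w - 0.1))
    if inner_w > outer_w + 0.16:   # very white region
        fluid = True
    if inner_gb > 10:              # strongly green-bluish region
        fluid = True
    return fluid
```

For instance, a gray region with mean color (200, 200, 200) surrounded by a reddish rim with mean color (180, 100, 90) is first rejected by the general rule and then accepted by the very-white rule, because its whiteness exceeds that of the rim by more than 0.16.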

[0652] Finally, the method 2258 in Figure 112 includes constructing the final fluid-foam mask. 
The final fluid-foam mask is a logical "OR" of the two segmented and validated masks as 
follows: FluidFoamMask = fBGMask OR fWMask. 

Classifiers 

[0653] In one embodiment, the tissue characterization system 100 of Figure 1 comprises using
broadband reflectance data obtained during a spectral scan of regions (interrogation points) of a
tissue sample to determine probabilities that a given region belongs in one or more
tissue-class/state-of-health categories. In one embodiment, probabilities of classification are
determined as a combination of probabilities computed by two different statistical classifiers.
The two classifiers are a DASCO classifier (discriminant analysis with shrunken covariances),
and a DAFE classifier (discriminant analysis feature extraction). The DASCO classifier (step
1484, Figure 74) uses a principal component analysis technique, and the DAFE classifier (step
1482, Figure 74) uses a feature coordinate extraction technique to determine probabilities of
classification.

[0654] The embodiment shown in Figure 74 applies a necrosis mask 1424 and a hard
"indeterminate" mask 1426 to a set of arbitrated broadband spectral data to eliminate the need to
further process certain necrotic and indeterminate interrogation points in the classification steps
1482, 1484, 1486. After determining statistical classification probabilities in step 1486, the
embodiment of Figure 74 applies a soft "indeterminate" mask 1428 as well as the NED (no
evidence of disease) classification result 1430 in order to obtain a final characterization 1432 of
each interrogation point on the tissue sample as Necrotic, CIN 2/3, NED, or Indeterminate.

[0655] The statistical classifiers in steps 1482 and 1484 of Figure 74 each determine respective
probabilities that a given region belongs to one of the following five tissue-class/state-of-health
categories: (1) Normal squamous (Ns), (2) CIN 1 (C1), (3) CIN 2/3 (C23), (4) Metaplasia (M),
and (5) Normal columnar (Col) tissue. Other embodiments use one or more of the following
tissue classes instead of or in addition to the categories above: CIN 2, CIN 3, NED (no evidence
of disease), and cancer. The category with the highest computed probability is the category that
best characterizes a given region according to the classifier used. In one alternative embodiment,
other categories and/or another number of categories are used. The results of the two statistical
classifiers are combined with the NED mask classification, along with the hard and soft
"indeterminate" masks, to obtain a final characterization for each interrogation point 1432.

[0656] In one embodiment, statistical classification includes comparing test spectral data to
sets of reference spectral data (training data) representative of each of a number of classes. A
collection of reference spectra from the same tissue class is a class data matrix. For example, a
class data matrix T_j comprising reference spectra (training data) from samples having known
class j is expressed as in Equation 96 as follows:

$$T_j = \begin{bmatrix} S_1^{(j)}(\lambda_1) & \cdots & S_1^{(j)}(\lambda_p) \\ \vdots & \ddots & \vdots \\ S_{n_j}^{(j)}(\lambda_1) & \cdots & S_{n_j}^{(j)}(\lambda_p) \end{bmatrix} \qquad (96)$$

where class j contains n_j reference spectra, S(λ), and each reference spectrum,
S(λ) = [S(λ_1), S(λ_2), . . . , S(λ_p)], is a p-dimensional vector where p is the number of
wavelengths in a measured spectrum. The class data matrix T_j has associated with it a class
mean vector μ_j (a 1-by-p vector) and a class covariance matrix C_j (a p-by-p matrix) as shown
in Equations 97 - 99 as follows:
(99)

Statistical tissue classification uses reference data to determine, for a given test spectrum, to
which class(es) and with what probability(ies) that test spectrum can be assigned.
[0657] The broadband data used in the statistical classifiers in steps 1482 and 1484 are
wavelength truncated. For the DASCO classifier (step 1484), only training data and testing data
that correspond to wavelengths between about 400 nm and about 600 nm are used. For the
DAFE classifier (step 1482), only training data and testing data that correspond to wavelengths
between about 370 nm and about 650 nm are used. One alternative embodiment uses different
wavelength ranges. The training data include reference broadband reflectance data from
interrogation points having a known classification in one of the five states of health, and the
testing data include broadband reflectance data from a region having an unknown classification.

[0658] The discriminant analysis feature extraction (DAFE) method of step 1482 in Figure 74
transforms a measurement of high dimension into a feature space of lower dimension. Here, the
feature space is the orthogonal projection in the direction of maximal data discrimination. The
DAFE method includes constructing feature coordinates by computing the feature space
projection matrix. The projection matrix requires the inversion of the pooled within-groups
covariance matrix, C_pool. Where T_1, T_2, ..., T_g are training matrices for classes 1 through g
(here, for example, g = 5), the number of reference spectra in a given class, n_j, may be less than
the number of wavelengths in a measured spectrum, p; and C_pool is therefore singular and cannot
be inverted.

[0659] Thus, in one embodiment of the DAFE method of step 1482, the spectral measurements
are subsampled so that a covariance matrix can be computed. In one embodiment, a
subsampling rate, n_z, is determined according to Equation 100:

n_z = max                                                                    (100)

where p is the number of wavelengths in a measured spectrum; n_1, n_2, . . ., n_g represent the
numbers of reference spectra in each of classes 1, 2, ..., g, respectively; and ⌊ ⌋ indicates the
"nearest integer" function. Typically, n_z = 2 or 3, but values up to about 10 do not generally
remove too much information from a measured reflectance spectrum, and may also be
considered. After subsampling, the non-singular pooled covariance matrix, C_pool, is computed
according to Equation 101 as follows:



$$C_{pool} = \frac{1}{n-g}\sum_{k=1}^{g}(n_k-1)\,C_k \qquad (101)$$

where n_k is the number of reference spectra in class k; and C_k is the covariance matrix for class k.
Then, the between-groups covariance, C_btwn, is computed according to Equation 102:
[0660] Next, the matrix P = C_pool^{-1} · C_btwn is formed and singular value decomposition is
applied to obtain the following:

P = U D V                                                                    (103)

Let U_{g-1} equal the first g - 1 columns of the orthogonal matrix of singular values U as follows:

(104)

Then, the feature projection, mapping measured space into feature space, is obtained via
right-multiplication by U_{g-1}.

[0661] The DAFE classification algorithm (step 1482 of Figure 74) proceeds as follows. Let
T̂_1, T̂_2, ..., T̂_g be the wavelength reduced, subsampled training (class data) matrices and Ŝ(λ) be
the corresponding wavelength reduced, subsampled test spectrum. The matrices T̂_j and Ŝ(λ) are
projected into feature space as follows:

$$\hat{T}_j \mapsto V_j = \hat{T}_j\,U_{g-1}, \qquad \hat{S}(\lambda) \mapsto \hat{S}(\lambda)\,U_{g-1} \qquad (105)$$

Next, the group mean vectors, group covariance matrices, and pooled within-groups covariance
matrix are computed using the projection matrix, V_j, in Equation 105, and using Equations 97,
98, and 101 as shown in Equations 106-108:

$$\mu_j = \mathrm{mean}(V_j) \qquad (106)$$

$$C_j = \mathrm{cov}(V_j) \qquad (107)$$

$$C_{pool} = \frac{1}{n-g}\sum_{j=1}^{g}(n_j-1)\,C_j \qquad (108)$$

Then,, the Friedman matrix is calculated using the Friedman parameters ^and X according to 
10 Equation 109 as follows: 

FrfrjL) = (1 - yW - X) Cj + X C pool ] + -^fr[(l - 2) C, + A ]• / to w) (109) 

In one embodiment, y- 0 and X = 0.5. Next, the Mahalanobis distance, dj{x), is determined from 
the test spectrum to each data class according to Equation 110: 

d](x) = (x- Mj ).Fr: x (r,V'{x-Mjy (HO) 
15 The Mahalanobis distance is a (1-by-l) number. Next, the Bayes' score is computed according 
to Equation 111: 

brj(x) = rfj(x)-21n( 0 ) + ln{det(F 0 ( r ,A)|) (111) 

The index j at which the minimum Bayes' score is attained indicates the classification having the 
highest probability for the test point in question. The D AFE probability of classification for 
20 class / can be computed for any of the g classifications according to Equation 1 12: 



ftohfreOM/)- , V, Ar- ' V ^ ^ ^ r (112) 



[0662] DAFE classification probabilities are computed in this way for each of the interrogation
points having a test reflectance spectrum, S(λ), that is not eliminated in the Necrosis masking
step (1424) or the hard "indeterminate" masking step (1426) in the embodiment shown in Figure
74.
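The last steps of the DAFE classifier (regularized covariance, Bayes' score, class probability) can be sketched as below. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, matrices are plain lists of lists, the trace term is assumed to be scaled by the feature-space dimension, and the Mahalanobis distance, prior r_j, and determinant are assumed to be supplied by the caller.

```python
import math

def friedman_matrix(c_j, c_pool, gamma=0.0, lam=0.5):
    """Blend a class covariance with the pooled covariance (lam), then
    shrink toward a scaled identity (gamma). Inputs are square lists of lists.
    Assumption: the trace term is divided by the matrix dimension."""
    d = len(c_j)
    mixed = [[(1 - lam) * c_j[r][c] + lam * c_pool[r][c] for c in range(d)]
             for r in range(d)]
    trace = sum(mixed[i][i] for i in range(d))
    return [[(1 - gamma) * mixed[r][c] + (gamma * trace / d if r == c else 0.0)
             for c in range(d)] for r in range(d)]

def bayes_score(d2, prior, det_fr):
    """Bayes' score from a squared Mahalanobis distance, a class prior,
    and the determinant of the regularized covariance."""
    return d2 - 2.0 * math.log(prior) + math.log(abs(det_fr))

def class_probabilities(scores):
    """Convert per-class Bayes' scores to probabilities that sum to 1.
    The minimum score is subtracted first for numerical stability only."""
    base = min(scores)
    weights = [math.exp(-0.5 * (s - base)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

The class with the smallest Bayes' score receives the largest probability, which matches the minimum-score decision rule stated above.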
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[0663] Step 1484 in Figure 74 is the DASCO (discriminant analysis with shrunken 
covariances) method. Like the DAFE method of step 1482, the DASCO method reduces the
dimensionality of the measured space by transforming it into a lower dimensional feature space.
DASCO differs from DAFE in that the feature space for the DASCO method is along orthogonal
directions of maximal variance, not (necessarily) maximal discrimination. Also, DASCO uses
two Mahalanobis distances, not just one. The first distance is the distance to feature centers in
primary space and the second distance is the distance to feature centers in secondary space.
[0664] In one embodiment, the DASCO method (step 1484) proceeds as follows. First, a
collection {T_1, T_2, ..., T_g} of n_j-by-p training matrices is obtained from reference (training)
broadband arbitrated reflectance measurements. The amount of reflectance spectral data
obtained from a test region (interrogation point), as well as the amount of training data, are
reduced by truncating the data sets to include only wavelengths between 400 nm and 600 nm.
[0665] Next, the training data and test data are scaled using mean scaling (mean centering) as
follows:

$$T_j \mapsto (T_j - M_j) \equiv Y_j, \quad \text{where } M_j = \begin{bmatrix} \mu_j \\ \vdots \\ \mu_j \end{bmatrix}_{n_j \times p}$$    (113)
$$S(\lambda) \mapsto (S(\lambda) - \mu_j) \equiv S_j$$    (114)

where j = 1, 2, ..., g and g is the total number of tissue-class/state-of-health classes. The number
of principal components in primary space is n_p, and the number of principal components in
secondary space is n_s. The total number of components is n_t. In one embodiment, n_p = 3, n_s = 1,
and n_t = 4.

[0666] Next, the first n_t principal component loadings and scores are computed. This involves
computing the singular value decomposition of the mean scaled training data matrix Y_j from
Equation 113, as follows:

$$Y_j = U_j D_j V_j^{T}$$    (115)

A similar computation was made in Equation 104. Let V_{j,n_t} be the matrix comprised of the first n_t
columns of V_j. The loadings and scores for Y_j are therefore indicated, respectively, in Equations
116 and 117, as follows:

$$Ld_j = V_{j,n_t}$$    (116)
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$$sc_j = Y_j \cdot V_{j,n_t} = Y_j \cdot Ld_j$$    (117)

where Ld_j is a p-by-n_t matrix, and sc_j is an n_j-by-n_t matrix.

[0667] The next step in the DASCO method is to compute the class mean scores and
covariances. First, the class mean vector in primary space, v_{j,p}, and the class mean vector in
secondary space, v_{j,s}, are computed as follows:

$$v_j = \mathrm{mean}(sc_j)$$    (118)

(the mean is computed analogously to μ_j in Equation 97)

where v_{j,p} = [v_{j,1}, v_{j,2}, ..., v_{j,n_p}] and v_{j,s} = [v_{j,n_p+1}, v_{j,n_p+2}, ..., v_{j,n_p+n_s}].

Next, C_j = cov(sc_j) is defined as the class covariance matrix analogous to that in Equation 100.
In a manner similar to the computation of the primary and secondary space class mean vectors
above, C_j is decomposed into the primary (C_{j,p}) and secondary (C_{j,s}) space covariance matrices
according to Equations 121 - 124 as follows:
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(124) 



_ C /i, ,»,+! 0') C W| ,,^+2 (7) 

[0668] Next, the scaled test spectrum from Equation 114 is projected into each principal
component space according to Equation 125:

$$x(j) = Ld_j\, S_j$$    (125)

Then, x(j) is decomposed into primary and secondary space vectors as follows:

$$x(j) = [x_1(j), x_2(j), \ldots, x_{n_t}(j)] = x_{j,p} \oplus x_{j,s}$$    (126)
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where x_{j,p} = [x_1(j), x_2(j), ..., x_{n_p}(j)] is the projection of x(j) into primary space and
x_{j,s} = [x_{n_p+1}(j), x_{n_p+2}(j), ..., x_{n_p+n_s}(j)] is the projection of x(j) into secondary space.

[0669] The Mahalanobis distances in primary and secondary space are computed according to
Equations 127 and 128 as follows:

$$d_p^2(x(j)) = (x_{j,p} - v_{j,p}) \cdot C_{j,p}^{-1} \cdot (x_{j,p} - v_{j,p})^{T}$$    (127)

where F, v = fr ^^ • j . Then, the total distance is computed according to Equation 129 as 



follows:

$$d(x(j)) = \sqrt{d_p^2(x(j)) + d_s^2(x(j))}$$    (129)

[0670] The DASCO probability of class assignment to class j is obtained by computing the
Bayes' score according to Equations 130 and 131 as follows:

$$br_j(x(j)) = d^2(x(j)) - 2\ln(r_j) + \ln\left(\left|\det(C_{j,p})\right|\right) + n_s \cdot \ln\left(\left|\det(F_{j,s})\right|\right)$$    (130)

$$\mathrm{Prob}(x(j) \in \mathrm{Class}\ j) = \frac{\exp\left(-\frac{1}{2}\, br_j(x(j))\right)}{\sum_{k=1}^{g} \exp\left(-\frac{1}{2}\, br_k(x(k))\right)}$$    (131)

Equation 131 is evaluated for all classes j = 1, 2, ..., g. DASCO classification probabilities are
computed in this way for each of the interrogation points having a test reflectance spectrum, S(λ), that
is not eliminated in the Necrosis masking step (1424) or the hard "indeterminate" masking step
(1426) in the embodiment shown in Figure 74.

[0671] Probabilities determined using the DAFE classifier in step 1482 of Figure 74 and
probabilities determined using the DASCO classifier in step 1484 are combined and normalized
in step 1486 to obtain for each interrogation point a set of statistical probabilities that the point
belongs, respectively, to one of a number of tissue-class/state-of-health categories. In one
embodiment, there are five classes, as described above, including the following: (1) Normal
squamous (N_s), (2) CIN 1 (C_1), (3) CIN 2/3 (C_23), (4) Metaplasia (M), and (5) Columnar (C_ol)
tissue.

[0672] The probability matrices P_DAFE and P_DASCO contain probability vectors corresponding to
the interrogation points in the scan pattern and are expressed as shown in Equations 132 and 133
as follows:

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$$P_{DAFE} = \begin{bmatrix} P_{DAFE,1}(1) & P_{DAFE,2}(1) & \cdots & P_{DAFE,g}(1) \\ P_{DAFE,1}(2) & P_{DAFE,2}(2) & \cdots & P_{DAFE,g}(2) \\ \vdots & \vdots & & \vdots \\ P_{DAFE,1}(nip) & P_{DAFE,2}(nip) & \cdots & P_{DAFE,g}(nip) \end{bmatrix}$$    (132)

$$P_{DASCO} = \begin{bmatrix} P_{DASCO,1}(1) & P_{DASCO,2}(1) & \cdots & P_{DASCO,g}(1) \\ P_{DASCO,1}(2) & P_{DASCO,2}(2) & \cdots & P_{DASCO,g}(2) \\ \vdots & \vdots & & \vdots \\ P_{DASCO,1}(nip) & P_{DASCO,2}(nip) & \cdots & P_{DASCO,g}(nip) \end{bmatrix}$$    (133)

where g is the total number of classes (for example, g = 5); nip is the total number of
interrogation points for which DAFE and DASCO probabilities are calculated (for example, nip
= up to 499); P_DAFE,i(j) represents the DAFE probability that the interrogation point j belongs to
class i; and P_DASCO,i(j) represents the DASCO probability that the interrogation point j belongs
to class i.

[0673] Step 1486 of Figure 74 represents the combination and normalization of classification
probabilities determined by the DAFE and DASCO classifiers in steps 1482 and 1484,
respectively. The combined/normalized probability matrix, P_COMB, is obtained by multiplying
the probability matrices P_DAFE and P_DASCO (Equations 132 and 133) element-wise and dividing
each row of the product by the sum of that row's elements.
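The combination step can be illustrated with a short sketch. It assumes each probability matrix is a list of rows, one row per interrogation point and one column per class; the function name and the handling of an all-zero row are assumptions, not part of the described method.

```python
def combine_probabilities(p_dafe, p_dasco):
    """Element-wise product of two probability matrices, with each row
    renormalized so that its entries sum to 1."""
    combined = []
    for row_a, row_b in zip(p_dafe, p_dasco):
        product = [a * b for a, b in zip(row_a, row_b)]
        total = sum(product)
        # Assumption: a row whose product is all zeros is left as zeros.
        combined.append([v / total for v in product] if total > 0 else product)
    return combined
```

For a point with DAFE probabilities [0.5, 0.5] and DASCO probabilities [0.8, 0.2], the product is [0.4, 0.1] and the normalized result is [0.8, 0.2].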



[0674] The block diagram of Figure 74 includes steps representing the combination of spectral
masks and image masks (1468, 1470, 1472, 1474), as well as the application of the combined
masks (1466, 1476, 1424, 1478, 1480, 1424, 1426, 1428, 1430) in a tissue characterization
system, according to one embodiment. These steps are discussed in more detail below.
[0675] As discussed above, the Necrosis_spec mask identifies interrogation points whose spectral
data are indicative of necrotic tissue. Since necrosis is one of the categories in which
interrogation points are classified in step 1432 of Figure 74, the Necrosis_spec mask is used not
only to eliminate interrogation points from further processing, but also to positively identify
necrotic regions. Therefore, it is necessary to filter out points affected by certain artifacts that
may erroneously cause a positive identification of necrosis.

[0676] Step 1466 of Figure 74 indicates that two image masks are applied to the necrosis
spectral mask - the smoke tube mask, [ST]_vid, 1450 and the speculum mask, [SP]_vid, 1452.
Regions in which a speculum or smoke tube has been identified cannot be positively identified as
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necrotic. Thus, interrogation points having any portion covered by pixels indicated by the smoke
tube mask, [ST]_vid, 1450 and/or the speculum mask, [SP]_vid, 1452 are identified as
"Indeterminate" and are eliminated from the necrosis mask.

[0677] Following this treatment, the necrosis mask is then applied in the broadband reflectance
spectra classification sequence in step 1424 of Figure 74. Each interrogation point at which the
necrosis mask applies is classified as "Necrotic". The broadband spectral data at these
interrogation points are then eliminated from further processing, or, alternately, the results of the
statistical classifiers at these points are ignored in favor of classification of the points as
"Necrotic". Similarly, the necrosis mask is applied in the NED (no evidence of disease) spectral
classification sequence in step 1476 of Figure 74. Each interrogation point at which the necrosis
mask applies is classified as "Necrotic". The NED_spec mask need not be computed for these
interrogation points, or, alternately, the results of the NED_spec mask at these points may be
ignored in favor of classification of the points as "Necrotic".

[0678] Three image masks are combined to form a fluorescence hard mask, "F Hard," which is
applied in the NED (no evidence of disease) spectral classification sequence in step 1478 of
Figure 74. As discussed hereinabove, hard masking results in a characterization of
"Indeterminate" at affected interrogation points, and no further classification computations are
necessary for such points. The combined fluorescence hard mask, "F Hard," 1468 is a
combination of the three image masks shown in Figure 74 (1448, 1450, 1452), according to
Equation 134 as follows:

F Hard = [ROI]_vid OR [ST]_vid OR [SP]_vid    (134)

The combined "F Hard" mask is applied in the NED spectral classification sequence in step 1478
of Figure 74. Each interrogation point at which the "F Hard" mask applies is classified as
"Indeterminate". The NED_spec mask is not computed for these interrogation points. The "F
Hard" mask applies for each interrogation point having any portion covered by pixels indicated
by the "F Hard" combined image mask.

[0679] Two spectral masks and five image masks are combined to form a broadband
reflectance "hard" mask, which is applied in the broadband reflectance statistical classification
sequence in step 1426 of Figure 74. The combined hard mask, "BB Hard", 1474 uses the image
masks [ST]_vid, [SP]_vid, [ROI]_vid, and [VW]_vid (1450, 1452, 1448, 1454) as hard masks, and also
treats them as "anchors" to qualify the sections of the two spectral masks - [CE]_spec and [MU]_spec
(1444, 1446) - that are used as hard masks. The outer rim of interrogation points in the spectral
pattern is also used as an anchor to the spectral masks. Finally, the intersection of the fluid-and-
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foam image mask [FL]_vid (1456) and the mucus spectral mask [MU]_spec (1446) is determined and
used as a hard mask in "BB Hard" (1474). Each interrogation point at which the "BB Hard"
mask applies is classified as "Indeterminate". The broadband spectral data at these interrogation
points are then eliminated from further processing, or, alternately, the results of the statistical
classifiers at these points are ignored in favor of classification of the points as "Indeterminate".
[0680] In one embodiment, the combined hard mask, "BB Hard," 1474 of Figure 74 is
determined according to the following steps.

[0681] First, form a combined image processing hard mask IPHardIPMsk using all the
interrogation points (IP's) that have any portion covered by one or more of the following image
masks: [ST]_vid, [SP]_vid, [VW]_vid and [ROI]_vid. The combined mask is expressed as:
IPHardIPMsk = [ST]_vid OR [SP]_vid OR [VW]_vid OR [ROI]_vid. Extend IPHardIPMsk to
include the level one and level two neighbors of the interrogation points indicated above. For
example, each IP that is not on an edge has 6 level one neighbors and 12 level two neighbors, as
shown in the scan pattern 202 in Figure 5. Let extIMHardIPMsk be the new mask. Add all
outer rim interrogation points to extIMHardIPMsk to form anchorMsk. The rim is defined by
the following interrogation points for the 499-point scan pattern 202 shown in Figure 5: 1-9,
17-20, 31-33, 47-48, 65-66, 84-85, 104-105, 125-126, 147-148, 170, 193, 215-216, 239, 263,
286-287, 309, 332, 354-355, 376-377, 397-398, 417-418, 436-437, 454-455, 469-471, 482-485,
493-499. Form a combined spectral mask SpecIPMsk using all the interrogation points that are
marked as either [CE]_spec or [MU]_spec (or both). Intersect the image processing anchor mask and
the combined spectral mask to obtain SPHardMsk: SPHardMsk = anchorMsk AND
SpecIPMsk. Intersect the image processing mask, [FL]_vid, and spectral mucus mask, [MU]_spec, to
obtain the fluid hard mask FluidHardIPMsk: FluidHardIPMsk = [FL]_vid AND ([MU]_spec OR
[CE]_spec). Finally, form the final hard mask: BBHard = IPHardIPMsk OR SPHardMsk OR
FluidHardIPMsk.
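The mask logic above can be sketched with Python sets. This is an illustrative sketch only: each mask is assumed to be a set of interrogation-point numbers, `neighbors` is a hypothetical adjacency map from a point to its level one neighbors, and level two neighbors are assumed to be the neighbors of level one neighbors.

```python
def bb_hard_mask(st, sp, vw, roi, fl, ce, mu, rim, neighbors):
    """Return the set of interrogation points covered by the BB Hard mask.

    st, sp, vw, roi, fl: points covered by the respective image masks.
    ce, mu: points flagged by the two spectral masks.
    rim: outer-rim points of the scan pattern.
    neighbors: dict mapping a point to the set of its level one neighbors.
    """
    ip_hard = st | sp | vw | roi
    level1 = set().union(*(neighbors.get(ip, set()) for ip in ip_hard))
    # Assumption: level two neighbors are neighbors of level one neighbors.
    level2 = set().union(*(neighbors.get(ip, set()) for ip in level1))
    anchor = ip_hard | level1 | level2 | rim
    sp_hard = anchor & (ce | mu)
    fluid_hard = fl & (mu | ce)
    return ip_hard | sp_hard | fluid_hard
```

A spectral-mask point is therefore hard-masked only when it lies near an image-mask region or on the rim, whereas image-mask points are always hard-masked.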

[0682] Two image masks - Blood_vid and Os_vid (1458, 1460) - are combined to form a
fluorescence "soft" mask, "F soft," 1470 which is applied in the NED spectral classification
sequence in step 1480 of Figure 74. As discussed hereinabove, soft masking involves applying a
weighting function to data from points identified by the mask in order to weight the data
according to the likelihood they are affected by an artifact. The mask "F soft" determines two
weighting functions - pen_blood(IP) and pen_os(IP) - for interrogation points (IP's) that are at least
partially covered by the image masks Blood_vid and Os_vid (1458, 1460). As discussed
hereinabove, a percentage coverage, α, is determined for each interrogation point according to
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the percentage of pixels corresponding to the interrogation point that coincide with the image
mask. For the image masks Blood_vid and Os_vid (1458, 1460), corresponding values α_blood(IP) and
α_os(IP) are determined for each affected interrogation point, and Equations 135 and 136 are used
to calculate the corresponding weighting at these interrogation points:

$$\mathrm{pen}_{blood}(IP) = 1 - \alpha_{blood}(IP)$$    (135)
$$\mathrm{pen}_{os}(IP) = 1 - \alpha_{os}(IP)$$    (136)

The application of pen_blood(IP) and pen_os(IP) in the NED spectral classification sequence of step
1480 is discussed in more detail below.

[0683] Two image masks - Glare_vid and Mucus_vid (1462, 1464) - are combined to form a
broadband reflectance "soft" mask, "BB soft", 1472 which is applied in the broadband
reflectance statistical classification sequence in step 1428 of Figure 74. As discussed
hereinabove, soft masking involves applying a weighting function to data from points identified
by the mask in order to weight the data according to the likelihood it is affected by an artifact.
The mask "BB soft" determines two weighting functions - pen_glare(IP) and pen_mucus(IP) - for
interrogation points (IP's) that are at least partially covered by the image masks Glare_vid and
Mucus_vid (1462, 1464). As discussed hereinabove, a percentage coverage, α, is determined for
each interrogation point according to the percentage of pixels corresponding to the interrogation
point that coincide with the image mask. For the image masks Glare_vid and Mucus_vid (1462,
1464), corresponding values α_glare(IP) and α_mucus(IP) are determined for each affected
interrogation point, and Equations 137 and 138 are used to calculate the corresponding penalties
at these interrogation points:

$$\mathrm{pen}_{glare}(IP) = 1 - \{\alpha_{glare}(IP)\}^{1/5}$$    (137)
$$\mathrm{pen}_{mucus}(IP) = 1 - \alpha_{mucus}(IP)$$    (138)

The application of pen_glare(IP) and pen_mucus(IP) in the broadband reflectance statistical
classification sequence at step 1428 is discussed in more detail below.
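The four soft-mask weighting functions of Equations 135-138 can be sketched together as follows. The function name is hypothetical, and coverage is assumed to be expressed as a fraction between 0 and 1 rather than as a percentage.

```python
def soft_mask_penalties(cov_blood=0.0, cov_os=0.0, cov_glare=0.0, cov_mucus=0.0):
    """Return the weighting (penalty) for each artifact at one interrogation
    point, given the fraction of its pixels covered by each image mask."""
    return {
        "blood": 1.0 - cov_blood,
        "os": 1.0 - cov_os,
        # The fifth root makes even a small glare coverage weigh heavily.
        "glare": 1.0 - cov_glare ** (1.0 / 5.0),
        "mucus": 1.0 - cov_mucus,
    }
```

A weight of 1 leaves the data at the point unchanged, and a weight of 0 discounts it entirely. Because of the fifth root, a glare coverage of only about 3% (0.5 to the fifth power) already halves the glare weight.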

[0684] The tissue-class/state-of-health classification of interrogation points includes the
application of masks as determined above. These steps are shown in Figure 74. The
tissue-class/state-of-health classification method includes an NED (no evidence of disease) spectral
classification sequence, as well as a broadband reflectance statistical classification sequence, that
apply the combined hard masks and soft masks described above. As discussed hereinabove, the
separate identification of necrotic regions and NED regions based on at least partially heuristic
techniques allows for the development of a statistical classifier that concentrates on identifying
tissue less conducive to heuristic classification, for example, CIN 2/3 tissue. Furthermore, by
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eliminating data affected by artifacts, the statistical classifiers are further improved, leading to
improved sensitivity and specificity of the final classification of a tissue sample.
[0685] The Necrosis mask (1424, 1476), "BB Hard" mask (1426), and "F Hard" mask (1478)
are applied as shown in Figure 74. Interrogation points coinciding with these masks are
identified as either "Necrotic" or "Indeterminate", as discussed hereinabove. In one
embodiment, these regions are removed from further consideration. The NED classification
sequence then applies the "F Soft" mask in step 1480. This is performed as explained below.
[0686] The NED_spec mask identifies interrogation points that indicate normal squamous tissue,
which is class (1) of the five classes used by the DAFE and DASCO classifiers discussed
previously. The NED_spec mask assigns at each indicated (masked) interrogation point a
probability vector p_s = [1, 0, ..., 0], where the normal squamous classification probability, N_s
(class 1), is set equal to 1 and all other class probabilities are set equal to 0. The "F Soft" mask
is applied in step 1480 by multiplying the N_s probability of indicated (masked) NED
interrogation points by the product of the blood and os weighting functions, pen_blood(IP) ·
pen_os(IP). Hence, the normal squamous classification probability, N_s, at these points will be less
than 1.0. If the product, pen_blood(IP) · pen_os(IP), is equal to 0, then the interrogation point IP is
classified as "Indeterminate". The NED_spec mask probability vector p_s = 0 for all other
interrogation points. It is noted that if an interrogation point is not identified by the NED_spec
mask, its N_s probability calculated by the broadband reflectance statistical classification
sequence is unaffected. The application of the overall NED_spec mask is explained below in the
discussion of step 1430 in Figure 74.
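The NED probability vector and its "F Soft" penalty can be sketched as follows. The function name and return convention are hypothetical, and returning an all-zero vector for an "Indeterminate" point is an assumption made only for this illustration.

```python
def ned_probability_vector(is_ned, pen_blood=1.0, pen_os=1.0, n_classes=5):
    """Return (p_s, label) for one interrogation point.

    p_s is the NED mask probability vector, with the normal squamous
    probability first; label is "Indeterminate" when the soft-mask
    product is zero, otherwise None.
    """
    if not is_ned:
        return [0.0] * n_classes, None
    weight = pen_blood * pen_os
    if weight == 0:
        # Assumption: the vector is zeroed when the point is Indeterminate.
        return [0.0] * n_classes, "Indeterminate"
    return [weight] + [0.0] * (n_classes - 1), None
```

For example, an NED point half covered by the blood mask and clear of the os mask receives p_s = [0.5, 0, 0, 0, 0].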

[0687] The broadband reflectance statistical classification sequence applies the Necrosis mask
(1424) and the "BB Hard" mask (1426) before determining statistical classification probabilities
in steps 1482, 1484, and 1486. As discussed above, the output of the broadband statistical
classification is the probability matrix, P_COMB, made up of probability vectors for the
interrogation points, each vector indicating respective probabilities that a given interrogation
point belongs to one of the five tissue-class/state-of-health categories - (1) Normal squamous
(N_s), (2) CIN 1 (C_1), (3) CIN 2/3 (C_23), (4) Metaplasia (M), and (5) Columnar (C_ol) tissue. The
broadband reflectance statistical classification sequence then applies the "BB Soft" mask in step
1428 by multiplying all five probabilities for each affected (masked) interrogation point by the
quantity pen_glare(IP) · pen_mucus(IP).

[0688] Step 1432 of Figure 74 classifies each interrogation point as Necrotic, CIN 2/3, NED,
or Indeterminate. In one embodiment, the probabilities in P_COMB that correspond to CIN 2/3
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classification, P_COMB,C23(IP) [class 3], are considered indicative of "CIN 2/3" classification in
step 1432, and all other classification categories in P_COMB - classes 1, 2, 4, and 5 (N_s, C_1, M, and
C_ol) - are considered indicative of "NED" tissue. In an alternative embodiment, further
classification distinctions are made in step 1432.
[0689] In step 1430 of Figure 74, the results of the NED_spec mask are applied to the broadband
reflectance-based classifications, P_COMB. The "Necrotic" interrogation points and the
hard-masked "Indeterminate" points have been identified and removed before step 1430. In step
1430, the remaining interrogation points are either classified as "Indeterminate" or are assigned a
value of CIN 2/3 classification probability, p_C23(IP). Here, p_C23(IP) is the CIN 2/3 classification
probability for interrogation point IP that is set as a result of step 1430. Interrogation points that
are not identified by the NED_spec mask have been assigned NED_spec mask probability vector p_s =
0, and p_C23(IP) = P_COMB,C23(IP) for these points. Interrogation points that are identified by the
NED mask have p_s = [1, 0, ..., 0], or p_s = [{pen_blood(IP) · pen_os(IP)}, 0, ..., 0], (where p_s,Ns(IP) =
1 or pen_blood(IP) · pen_os(IP)) depending on whether the point has been penalized or not by the "F
Soft" mask in step 1480. The following describes how values of p_C23(IP) are determined for
interrogation points that are identified by the NED_spec mask:

Due to spectral arbitration in step 128 of Figure 74, the broadband signal may
have been suppressed for some interrogation points, and only fluorescence spectra
are available. For these interrogation points, the following rules are applied in
step 1430 of Figure 74:

1. IF p_s,Ns(IP) > 0, THEN p_C23(IP) = 0.

2. ELSE the interrogation point IP is classified as
"Indeterminate".

For points having a valid arbitrated broadband signal and fluorescence signal, the
following rules are applied in step 1430 of Figure 74:

1. IF p_s,Ns(IP) = 1, THEN p_C23(IP) = 0.

2. IF p_s,Ns(IP) = 0, THEN p_C23(IP) = P_COMB,C23(IP).

3. IF p_s,Ns(IP) < 1, THEN:

IF p_s,Ns(IP) < P_COMB,Ns(IP), THEN p_C23(IP) = P_COMB,C23(IP),
ELSE, p_C23(IP) = 0.
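The rules of step 1430 can be sketched as a single function. This sketch rests on stated assumptions: the function and parameter names are hypothetical, and rule 3 is read as comparing the penalized NED probability p_s,Ns(IP) against the statistical normal squamous probability P_COMB,Ns(IP).

```python
def step_1430(ps_ns, has_broadband, pcomb_ns=None, pcomb_c23=None):
    """Return p_C23 for one interrogation point, or "Indeterminate".

    ps_ns: NED mask probability of normal squamous tissue (0 to 1).
    has_broadband: False when only fluorescence spectra are available.
    pcomb_ns, pcomb_c23: combined statistical probabilities of the
    normal squamous and CIN 2/3 classes (required when has_broadband).
    """
    if not has_broadband:
        return 0.0 if ps_ns > 0 else "Indeterminate"
    if ps_ns == 1:
        return 0.0
    if ps_ns == 0:
        return pcomb_c23
    # Assumed reading of rule 3: defer to the statistical classifier
    # only when it is more confident of normal squamous than the NED mask.
    return pcomb_c23 if ps_ns < pcomb_ns else 0.0
```

A returned value of 0 corresponds to an NED classification, and any positive value is carried forward as the probability of CIN 2/3.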

[0690] Step 1432 of Figure 74 classifies each interrogation point as Necrotic, CIN 2/3, NED,
or Indeterminate. Necrotic and hard-masked Indeterminate interrogation points are identified
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prior to step 1430, as described above. In step 1430, the remaining interrogation points are either
classified as Indeterminate or are assigned a value of p_C23(IP). For these points, if p_C23(IP) = 0,
the point is classified as NED. If p_C23(IP) > 0, the point is considered to have a non-zero
probability of high grade disease (CIN 2/3). In one embodiment, disease display (step 138 of
Figure 74) uses these non-zero p_C23(IP) values to distinguish regions having low probability of
CIN 2/3 and regions having high probability of CIN 2/3.

[0691] Step 1434 of Figure 74 represents post-classification processing. In one embodiment,
this includes a final clean-up step to remove isolated CIN 2/3-classified interrogation points on
the outer rim of the spectral scan pattern (for example, the outer rim consists of the numbered
interrogation points listed hereinabove). A CIN 2/3-classified interrogation point is considered
isolated if it has no direct, level-1 neighbors that are classified as CIN 2/3. Such isolated points
are re-classified as "Indeterminate" in step 1434 of Figure 74.
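The clean-up of step 1434 can be sketched as follows. The function name, the label strings, and the `neighbors` adjacency map are hypothetical; the sketch also assumes that isolation is judged against the labels as they stood before any re-classification.

```python
def clean_isolated_rim_points(labels, rim, neighbors, target="CIN 2/3"):
    """Re-classify isolated rim points as "Indeterminate".

    labels: dict mapping interrogation point to its classification.
    rim: set of outer-rim interrogation points.
    neighbors: dict mapping a point to its direct (level-1) neighbors.
    """
    cleaned = dict(labels)
    for ip in rim:
        if labels.get(ip) != target:
            continue
        # Isolated means no direct neighbor shares the target class.
        if not any(labels.get(nb) == target for nb in neighbors.get(ip, ())):
            cleaned[ip] = "Indeterminate"
    return cleaned
```

Points that are not on the rim are never changed, even if they are isolated.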

Image enhancement 

[0692] The brightness of an acquired image of a tissue sample may change from patient to
patient due to obstructions, tissue type, and other factors. As a result, some images may be too
dark for adequate visual assessment. Step 126 of the tissue characterization system 100 of
Figure 1 performs an image visual enhancement method to improve the image visual quality,
using an image intensity transformation method. The improved image may then be used, for
example, in the disease display of step 138 of Figure 1.
[0693] In one embodiment, the visual enhancement method of step 126 in Figure 1 involves
analyzing the histogram of the luminance values of an input image, determining luminance
statistics using only portions of the image corresponding to tissue, and performing a piecewise
linear transformation to produce a visually enhanced image. Step 126 involves using the image
masks, as shown in step 108 of Figures 1 and 73 and as described previously, in order to
determine which portions of the image are used to compute the image statistics. Step 126
includes performing brightness and contrast enhancement, as well as applying image feature
enhancement to improve local image features such as edges, borders, and textures of different
tissue types. Finally, a color balancing correction is applied to reduce the redness in certain
images.

[0694] The visual enhancement method of step 126 includes determining which portions of the
input tissue image correspond to tissue in the region of interest, as opposed to artifacts such as
glare, mucus, a speculum, the os, blood, smoke tube, and/or areas outside the region of interest.
Only the regions corresponding to tissue of interest are used in determining luminance statistics
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used in performing the visual enhancement. In one embodiment, the image masks of Figures 73
and 74 are used to determine the portion of the image corresponding to tissue of interest. In one
embodiment, this image portion is [tROI]_vid, a subset of the [ROI]_vid mask, computed in Equation
139 as follows:

[tROI]_vid = [ROI]_vid - {[Glare]_vid + [SP]_vid + [os]_vid + [Blood]_vid + [Mucus]_vid + [ST]_vid}    (139)

where the image masks above are as shown in Figure 74 and as described above.
[0695] Figures 113A-C show graphs representing a step in a method of image visual
enhancement in which a piecewise linear transformation of an input image produces an output
image with enhanced image brightness and contrast. A histogram 2328 is computed for the
luminance values μ (2326) of pixels within [tROI]_vid of an input image, and the histogram is used
to determine parameters of a piecewise linear transformation shown in the plot 2324 of Figure
113B. The transformation produces luminance values v (2330) of a corresponding brightness-
and contrast-enhanced output image. The transformed image generally has a wider range of
luminance values, stretching from the minimum intensity (0) to the maximum intensity (255),
than the input image. The luminance values from the input image are transformed so that input
luminance values within a given range of the mean luminance are stretched over a wider range of
the luminance spectrum than input luminance at the extremes. In one embodiment, the
piecewise linear transformation is as shown in Equation 140:

$$v = \begin{cases} \alpha\mu, & L_{min} \le \mu < \mu_a \\ \beta(\mu - \mu_a) + v_a, & \mu_a \le \mu < \mu_b \\ \gamma(\mu - \mu_b) + v_b, & \mu_b \le \mu \le L_{max} \end{cases}$$    (140)

where L_max is the maximum luminance value of a pixel within [tROI]_vid of the input image; the
parameters μ_a, μ_b, v_a, and v_b are piecewise linear breakpoints; and α, β, and γ are slopes of the
transformation.

[0696] In one embodiment, the image brightness and contrast enhancement is performed
according to the following steps. First, calculate the luminance L from the RGB signal of the
input image using the formula: L = 0.299 * R + 0.587 * G + 0.114 * B. Extract the luminance
image LROI within tROI ([tROI]_vid): LROI = L AND tROI. Compute the LROI mean,
IMean. Compute the piecewise linear breakpoints ma, mb, na, nb (μ_a, μ_b, v_a, and v_b) from the
LROI histogram, nHist[ ], as follows:

1. If ((IMean > 38) AND (IMean < 132)):

a. Compute and normalize nHist[ ] to the range [0, 1].
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b. Compute ma and mb, the 5% and 98% histogram tails:

ma = i, if sum(nHist[i]) > 0.05, i = 0 to 255.
mb = i, if sum(nHist[i]) > 0.98, i = 0 to 255.

c. Define the expected low and high intensity parameters na and nb:
na = 0 and nb = 180.

2. If ((IMean > 38) AND (IMean < 132) AND (ma > na AND ma < 100 AND nb
> 20)), compute the slope or the degree of enhancement, bcDOE:

bcDOE = (nb - na) / (mb - ma).

3. If ((IMean > 38) AND (IMean < 132)), apply brightness and contrast
enhancement transformation to input color image in RGB to obtain bcRGB
(brightness and contrast enhanced color image).
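The breakpoint computation can be sketched as below. This is a minimal sketch, not the described implementation: the function names are hypothetical, the histogram is assumed to be normalized so that its bins sum to 1, and ma and mb are taken as the first bins at which the cumulative histogram exceeds 5% and 98%.

```python
def luminance(r, g, b):
    """Luminance of one RGB pixel using the weights given in the text."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def histogram_tail(norm_hist, fraction):
    """First bin index at which the cumulative histogram exceeds fraction."""
    total = 0.0
    for i, value in enumerate(norm_hist):
        total += value
        if total > fraction:
            return i
    return len(norm_hist) - 1

def contrast_parameters(lum_values, na=0, nb=180):
    """Return (ma, mb, bcDOE) for integer tissue luminances in 0..255,
    or None when the image is outside the range that is enhanced."""
    mean = sum(lum_values) / len(lum_values)
    if not (38 < mean < 132):
        return None
    hist = [0] * 256
    for v in lum_values:
        hist[v] += 1
    norm = [h / len(lum_values) for h in hist]  # assumed: bins sum to 1
    ma = histogram_tail(norm, 0.05)
    mb = histogram_tail(norm, 0.98)
    if not (na < ma < 100 and nb > 20) or mb == ma:
        return None
    return ma, mb, (nb - na) / (mb - ma)
```

With tissue luminances spanning 40 to 200, this gives ma = 40, mb = 200, and a slope of 180 / 160 = 1.125.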

[0697] In addition to producing an output image with enhanced image brightness and contrast,
the visual enhancement method of step 126 (Figure 1) also includes performing an image feature
(local contrast) enhancement of the output image to emphasize high frequency components such
as edges and fine features for the purposes of visual inspection. In one embodiment, image
feature enhancement is performed using a spatial filtering technique according to Equations 141
and 142 as follows:

$$I_{out}(m,n) = I_{in}(m,n) + \rho\, G(m,n)$$    (141)
$$G(m,n) = I_{in}(m,n) - S(m,n)$$    (142)

where G(m, n) is the gradient image; ρ is the degree of the enhancement; I_in(m, n) and I_out(m, n)
are the original and the resultant image of the feature enhancement operation;
and S(m, n) is the smoothed (lowpass filtered) version of I_in(m, n).
[0698] In one embodiment, the image feature enhancement operation of the visual
enhancement method of step 126 is performed according to the following steps:
If IMean > 38:

1. Smooth bcRGB (brightness and contrast enhanced color image) with a 7x7
boxcar filter to obtain smRGB.

2. Subtract smRGB from bcRGB to obtain the gradient image, grRGB.

3. Dilate glareMsk twice to obtain dGlareMsk = dil (glareMsk, 2).

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4. Remove dilated glare regions from gradient image to avoid emphasizing glare
regions:

a. Convert gray image dGlareMsk to RGB image, dGlareMskC.

b. Remove glare image from gradient image to obtain grRGBgl:
grRGBgl = grRGB - dGlareMskC.

5. Define the degree of feature enhancement, feDOE, from experiments, feDOE = 0.8.

6. Scale grRGBgl by feDOE to obtain feRGB.

7. Add feRGB to bcRGB to produce image feature enhanced image fRGB.
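The feature enhancement can be sketched for a single image channel as follows. Several choices here are assumptions made for illustration: the function names are hypothetical, the boxcar window is truncated at the image borders, the gradient is set to zero inside the (already dilated) glare mask, and the output is clipped to the range 0 to 255.

```python
def boxcar_smooth(image, size=7):
    """Mean filter over a size x size window (truncated at the borders)."""
    height, width = len(image), len(image[0])
    half = size // 2
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(height, r + half + 1))
                      for cc in range(max(0, c - half), min(width, c + half + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def enhance_features(image, degree=0.8, size=7, glare_mask=None):
    """Add a scaled gradient (image minus its smoothed version) to the image."""
    smooth = boxcar_smooth(image, size)
    out = []
    for r, row in enumerate(image):
        new_row = []
        for c, value in enumerate(row):
            gradient = value - smooth[r][c]
            if glare_mask is not None and glare_mask[r][c]:
                gradient = 0.0  # assumed: no emphasis inside glare regions
            new_row.append(min(255.0, max(0.0, value + degree * gradient)))
        out.append(new_row)
    return out
```

A uniform image is returned unchanged because its gradient is zero everywhere, while an isolated bright pixel is pushed further above its surroundings.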

[0699] In addition to producing an output image with enhanced image brightness, contrast, and 
image features, the visual enhancement method of step 126 (Figure 1) also includes performing 
color balancing to reduce redness in certain overly-red tissue images, based on a mean-red-to- 
mean-blue ratio. 

[0700] In one embodiment, the color balancing operation of the visual enhancement method of 
step 126 is performed according to the following steps: 
If lMean > 38:

1. Split RGB (i.e. of the image feature enhanced image fRGB) into R, G, B.

2. Extract the R image (within the tROIMsk) and compute mean tissue redness, tRed.

3. Extract the B image (within the tROIMsk) and compute mean tissue blueness, tBlue.

4. Compute the red-blue ratio as RBRat = tRed / tBlue.

5. Perform color balancing:

If RBRat < 1.20, no red reduction.
Else if RBRat >= 1.20 AND RBRat < 1.32, R = 0.95 * R.
Else if RBRat >= 1.32 AND RBRat < 1.55, R = 0.90 * R.
Else if RBRat >= 1.55, R = 0.85 * R.

6. Combine the R, G and B channels to form the final color image for display. 
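
The color balancing rule above can be sketched as a small lookup on the red-blue ratio. This is an illustrative Python sketch under stated assumptions: channels are nested lists of floats, the tissue region-of-interest mask is a 0/1 image, and the image is left unchanged when the ratio is undefined. The function names red_scale_factor and balance_red are hypothetical.

```python
def red_scale_factor(rb_ratio):
    """Return the multiplier applied to the R channel for a given red-blue ratio."""
    if rb_ratio < 1.20:
        return 1.0   # no red reduction
    if rb_ratio < 1.32:
        return 0.95
    if rb_ratio < 1.55:
        return 0.90
    return 0.85


def balance_red(r, b, roi_mask):
    """Scale the R channel using the mean red and mean blue inside the tissue ROI mask."""
    coords = [(y, x) for y in range(len(r)) for x in range(len(r[0])) if roi_mask[y][x]]
    if not coords:
        return r  # assumption: nothing to measure, leave the channel unchanged
    t_red = sum(r[y][x] for y, x in coords) / len(coords)
    t_blue = sum(b[y][x] for y, x in coords) / len(coords)
    if t_blue == 0:
        return r  # assumption: ratio undefined, leave the channel unchanged
    k = red_scale_factor(t_red / t_blue)
    return [[k * v for v in row] for row in r]
```

Only the R channel is modified; G and B are recombined with the scaled R channel as in step 6.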

Diagnostic display 

[0701] In one embodiment, the tissue characterization system 100 of Figure 1 comprises 
producing a disease probability display 138 for a reference (base) image of a test tissue sample 
using the interrogation point classifications in step 1432 of Figure 74 (Necrotic, CIN 2/3, NED,
and Indeterminate). A method of disease probability display 138 includes producing an output
overlay image with annotations for indeterminate regions, necrotic regions, and/or regions of
low-to-high probability of high-grade disease, according to the classifications determined in step
1432 of Figure 74 for a given patient scan. The annotations are shown as an overlay on top of
the reference tissue image to provide easily-discernible tissue classification results, for example,
indicating regions of concern for the purposes of biopsy, treatment, diagnosis, and/or further
examination. 

[0702] In one embodiment, indeterminate regions are indicated by a gray "see-through"
crosshatch pattern that only partially obscures the underlying reference image. Necrotic regions
are indicated by a green trellis pattern. Regions of tissue associated with high-grade disease (for
example, CIN 2/3) are indicated by patches of contrasting color which intensify according to the
likelihood of high-grade disease. 

[0703] In one embodiment, the disease probability display method 138 of Figure 74 as applied
to a reference image of tissue from a patient scan includes the following steps: determining a
disease display layer from the classification results of step 1432, overlaying the disease display
layer on the reference image, determining an "indeterminate" mask from the classification
results, overlaying the indeterminate mask on the disease display image using a gray crosshatch
pattern, determining a "necrosis" mask from the classification results, and overlaying the
necrosis mask on the disease display image using a green trellis pattern. The result of the disease
probability display method 138 of Figure 74 is a state-of-health "map" of the tissue sample, with
annotations indicating indeterminate regions, necrotic regions, and/or regions of low-to-high
probability of high-grade disease.

[0704] Figure 114A represents an exemplary image of cervical tissue 2358 obtained during a
patient examination and used as a reference (base) image in constructing an output overlay
image in the disease probability display method 138 in Figure 74. Figure 114B shows the output
overlay image 2360 produced by the disease probability display method 138 in Figure 74 that
corresponds to the reference image 2358 in Figure 114A. The output overlay image 2360 in
Figure 114B contains annotations indicating indeterminate regions (2366), regions associated 
with a low probability of CIN 2/3 (2362), and regions associated with a high probability of CIN 
2/3 (2364). 

[0705] The disease probability display method 138 begins with the determination of a disease
display layer from the CIN 2/3 classification results of step 1432 in Figure 74. In step 1432,
values of p_C23(IP) are determined for interrogation points having a non-zero probability of
high-grade disease (here, CIN 2/3). An area of tissue indicative of high-grade disease is represented
on the disease display layer as an area whose color varies from yellow to blue, depending on
values of p_C23(IP) at corresponding interrogation points. The yellow color represents low
probability of high-grade disease, and the blue color represents high probability of high-grade
disease. At the low end of the probability range, the yellow color is blended into the reference
image so that there is no sharp discontinuity between the high-grade disease region and the
image. In one embodiment, a minimum cut-off probability, p_C23min(IP), is set so that
interrogation points with values of p_C23(IP) lower than the minimum cut-off do not show on the
disease display layer. In one embodiment, p_C23min(IP) = 0.2.

[0706] Figures 115A and 115B represent two stages in the creation of a disease display layer,
according to one embodiment. Figure 115A shows the disease display layer 2368 wherein
high-grade disease probabilities are represented by circles with intensities scaled by values of p_C23(IP)
at corresponding interrogation points. In order to more realistically represent regions of
high-grade disease on the tissue sample, the circles in Figure 115A are replaced with cones, then
filtered to produce the disease display layer 2372 shown in Figure 115B.

[0707] Finally, the grayscale intensity values are converted to a color scale so that regions of
high-grade disease appear on the overlay image as patches of contrasting color that intensify 
according to the likelihood of disease. 

[0708] In one embodiment, the disease probability display method 138 of Figure 1 includes 
creating a disease display layer according to the following steps: 
1. Retrieve the reference image (base image).

2. If all IPs are indeterminate, skip to creating the Indeterminate Mask.

3. Generate CIN 2/3 probability image, I_p, of base image size, for all
non-indeterminate IPs:

a. Generate a regular truncated cone centered at (15,15) on a square matrix
of size 29-by-29, set to 0:

i. The two truncating circles are centered around (15,15) and have
radii R_o = 14 and R_i = 6.

ii. For each cone point, cone(i, j), let R be the distance from the
geometric center (15,15).

1. If R > R_o, cone(i, j) = 0.

2. If R < R_i, cone(i, j) = 1.

3. If R_i <= R <= R_o, cone(i, j) = (R_o - R) / (R_o - R_i).

b. Initialize I_p to 0.

c. For each IP with probability p_C23(IP) > 0.2:

i. make a copy of the cone;

ii. scale it by p_C23(IP);

iii. add it to I_p with the cone's center aligned with the IP location.

d. Smooth I_p using a 33 by 33 separable symmetric Hamming window filter
specified by:

i. the following coefficients (since the filter is symmetric around the
origin, only 17 coefficients are specified below; the others are the
mirror image around 1.0):

(0.0800, 0.0888, 0.1150, 0.1575, 0.2147, 0.2844, 0.3640, 0.4503,
0.5400, 0.6297, 0.7160, 0.7956, 0.8653, 0.9225, 0.9650,
0.9912, 1.0);

ii. a gain of (0.85/301.37)^{1/2} for the 33-point 1D filter.

e. Linearly rescale I_p from the [0.2 1] range to the [0 1] range.

f. Clip rescaled I_p to range [0 1].

4. Compute an RGB colormap image and an alpha blending channel from the
probability image I_p. The colormap defines a transformation from integer
intensity values in the range [0,255] to an RGBα image.

a. The R colormap is a piecewise linear map specified by the following
breakpoints [0,255], [97,220], [179,138] and [255,0].

b. The G colormap is a piecewise linear map specified by the following
breakpoints [0,0], [81,50], [210,162] and [255,92].

c. The B colormap is a piecewise linear map specified by the following
breakpoints [0,255], [120,225], [178,251] and [255,255].

d. The α colormap is a piecewise linear map specified by the following
breakpoints [0,255], [120,225], [178,251] and [255,255].

e. Convert the floating point I_p image to an 8-bit image, in the range [0,255],
by rounding the product of each I_p pixel value and 255.

f. Use the tissue colormap to get RGBα pixel values for the disease display layer.
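
The probability-image steps above (cone generation, accumulation, Hamming smoothing, rescaling, and clipping) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the implementation of the described embodiment: the listed coefficients are taken to be the standard Hamming window 0.54 - 0.46 cos(2πk/32), whose 33 taps sum to about 17.36 (so that 17.36 squared is about 301.37), pixels outside the image are treated as zero during filtering, and interrogation points are given as (row, col, p) tuples. All function names are hypothetical.

```python
import math

R_OUTER, R_INNER, SIZE = 14, 6, 29  # cone radii and matrix size from step 3a


def make_cone():
    """Build the 29-by-29 truncated cone: 1 inside R_INNER, 0 outside R_OUTER."""
    c = SIZE // 2  # zero-based index 14 is the one-based center (15, 15)
    cone = [[0.0] * SIZE for _ in range(SIZE)]
    for i in range(SIZE):
        for j in range(SIZE):
            r = math.hypot(i - c, j - c)
            if r < R_INNER:
                cone[i][j] = 1.0
            elif r <= R_OUTER:
                cone[i][j] = (R_OUTER - r) / (R_OUTER - R_INNER)
    return cone


def hamming_kernel(n=33, gain_2d=0.85):
    """Return 1D Hamming taps scaled so the separable 2D filter sums to gain_2d."""
    taps = [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]
    gain_1d = math.sqrt(gain_2d) / sum(taps)  # about (0.85 / 301.37) ** 0.5 for n = 33
    return [gain_1d * t for t in taps]


def filter_rows(img, kernel):
    """Filter each row with the kernel; pixels outside the image count as zero."""
    half, w = len(kernel) // 2, len(img[0])
    return [[sum(kernel[t] * row[x + t - half]
                 for t in range(len(kernel)) if 0 <= x + t - half < w)
             for x in range(w)] for row in img]


def transpose(img):
    """Swap rows and columns so filter_rows can be reused for the vertical pass."""
    return [list(col) for col in zip(*img)]


def probability_image(height, width, points, cutoff=0.2):
    """Accumulate scaled cones at each point, smooth, rescale [cutoff, 1] to [0, 1], clip."""
    img = [[0.0] * width for _ in range(height)]
    cone, c = make_cone(), SIZE // 2
    for row, col, p in points:
        if p <= cutoff:
            continue  # at or below the minimum cut-off: not drawn
        for i in range(SIZE):
            for j in range(SIZE):
                y, x = row + i - c, col + j - c
                if 0 <= y < height and 0 <= x < width:
                    img[y][x] += p * cone[i][j]
    k = hamming_kernel()
    img = transpose(filter_rows(transpose(filter_rows(img, k)), k))
    return [[min(max((v - cutoff) / (1.0 - cutoff), 0.0), 1.0) for v in r] for r in img]
```

Because the filter is separable, the 33 by 33 smoothing is applied as one horizontal and one vertical 33-tap pass, which is why the gain is stated per 1D filter.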
[0709] Figure 116 shows the color transformation used in overlaying the disease display layer
onto the reference image, as in the overlay image 2360 of Figure 114B. The first colorbar 2374
in Figure 116 shows the blended colors from yellow to blue that correspond to values of disease
probability p_C23(IP), depicted on the x-axis 2375. A color corresponding to the average tissue
color is determined, as shown in colorbar 2378. The average tissue color is blended into the
probability-correlated yellow-to-blue colorbar 2374 so that the yellow color is blended into the
reference image where the disease probability, as indicated by the filtered disease display layer,
is low. This avoids a sharp discontinuity between the disease map and the tissue. In one
embodiment, the disease display layer and the base (reference) image are combined by using
alpha-channel blending, where the alpha channel is as shown in step #4 of the above method to
create a disease display layer. The disease display layer is overlaid upon the base image with
blending controlled by the computed alpha channel values according to Equation 143 as follows:

(Overlay Image Pixel) = α·(Disease Display Layer Pixel) + (1 - α)·(Base Image Pixel)    (143)
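
The colormap lookup and the blend of Equation 143 can be illustrated with a short Python sketch. It uses the R, G, and B breakpoints as listed in the method above and takes α as an input already normalized to [0, 1]; that normalization, and the names piecewise_linear and overlay_pixel, are assumptions made for illustration only.

```python
R_MAP = [(0, 255), (97, 220), (179, 138), (255, 0)]
G_MAP = [(0, 0), (81, 50), (210, 162), (255, 92)]
B_MAP = [(0, 255), (120, 225), (178, 251), (255, 255)]


def piecewise_linear(breakpoints, v):
    """Linearly interpolate the map output at input v between neighboring breakpoints."""
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if x0 <= v <= x1:
            return y0 + (y1 - y0) * (v - x0) / (x1 - x0)
    raise ValueError("input outside colormap range")


def overlay_pixel(prob, base_rgb, alpha):
    """Colormap a probability in [0, 1] and blend it onto a base pixel per Equation 143."""
    level = round(prob * 255)  # 8-bit intensity used to index the colormaps
    layer = [piecewise_linear(m, level) for m in (R_MAP, G_MAP, B_MAP)]
    return tuple(alpha * d + (1 - alpha) * b for d, b in zip(layer, base_rgb))
```

With α = 0 the base image pixel is returned unchanged, and with α = 1 the disease display layer pixel fully replaces it; intermediate values give the gradual transition described above.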

[0710] Next, the disease probability display method 138 of Figure 1 includes determining an
"indeterminate" mask from the classification results in step 1432 of Figure 74, where
indeterminate regions are indicated by a gray "see-through" crosshatch pattern. For an
exemplary reference image, interrogation points classified as "Indeterminate" in step 1432 of
Figure 74 indicate where the indeterminate mask is activated. The indeterminate crosshatch
mask is then combined with the output overlay image, as is shown in the overlay image 2360 of
Figure 114B. Here, indeterminate regions 2366 are indicated in shadowed regions around the
edge of the tissue sample. 

[0711] In one embodiment, the disease probability display method 138 of Figure 1 includes 
creating an indeterminate crosshatch mask according to the following steps:

1. Create image, msk, of base image size and set to 0.

2. Draw disks of radius 0.75 mm centered at the coordinate of each
indeterminate interrogation point.

3. Erode mask image 3 times to obtain erodMsk = erod (msk, 3).

4. Compute image binary perimeter, perMsk, of erodMsk:
perMsk = not (erod (erodMsk, 1)) AND (erodMsk).

5. Compute indeterminate crosshatch mask:

a. Retrieve crosshatch image, xhatch, defined by a horizontal pitch of 10
pixels, a vertical pitch of 20 pixels, a crosshatch slope of 2 and a grey
value of (166,166,166).

b. Perform logical OR of erodMsk and xhatch to obtain xhatchMsk.

c. Perform logical OR of xhatchMsk with perMsk. 
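
Steps 1 through 4 above (disk drawing, erosion, and binary perimeter) can be sketched in Python as follows. The sketch assumes a 3x3 square structuring element, treats pixels outside the image as 0, and takes the disk radius in pixels, since converting 0.75 mm to pixels depends on an image scale that is not given here. The function names are hypothetical.

```python
def draw_disk(msk, cy, cx, radius):
    """Set pixels within `radius` (in pixels) of (cy, cx) to 1, in place."""
    for y in range(len(msk)):
        for x in range(len(msk[0])):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                msk[y][x] = 1


def erode(msk, times=1):
    """Binary erosion with a 3x3 square element; outside the image counts as 0."""
    h, w = len(msk), len(msk[0])
    for _ in range(times):
        msk = [[int(all(0 <= y + dy < h and 0 <= x + dx < w and msk[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
                for x in range(w)] for y in range(h)]
    return msk


def perimeter(msk):
    """Pixels of msk removed by one further erosion: not(erode(msk, 1)) AND msk."""
    inner = erode(msk, 1)
    return [[int(m and not i) for m, i in zip(mrow, irow)]
            for mrow, irow in zip(msk, inner)]
```

A typical use would draw one disk per interrogation point into a zeroed mask, call erode(msk, 3) to obtain erodMsk, and then call perimeter(erodMsk) to obtain perMsk.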

[0712] Next, the disease probability display method 138 of Figure 1 includes determining a 
"necrosis" mask from the classification results in step 1432 of Figure 74, where necrotic regions 
are indicated by a green "see-through" trellis pattern. Figure 117A depicts an exemplary
reference image 2388 of cervical tissue having necrotic regions. For an exemplary reference
image, interrogation points classified as "Necrotic" in step 1432 of Figure 74 indicate where the
"necrosis" mask is activated. A necrosis trellis mask is included in the overlay image, as is
shown in the overlay image 2396 of Figure 117B.

[0713] In one embodiment, the disease probability display method 138 of Figure 1 includes 
creating a necrosis trellis mask according to the following steps: 

1. Create image, msk, of base image size, and set it to 0.

2. Draw disks of radius 0.75 mm centered at the coordinate of each necrotic
tissue interrogation point.

3. Erode mask image 3 times to obtain erodMsk = erod (msk, 3).

4. Compute image binary perimeter, perMsk, of erodMsk:
perMsk = not (erod (erodMsk, 1)) AND (erodMsk).

5. Compute necrotic tissue trellis mask:

a. Retrieve trellis image, trellis, defined by a horizontal pitch of 8 pixels, a
vertical pitch of 8 pixels, a line thickness of 2 and a green value of
(0,255,104).

b. Perform logical OR of erodMsk and trellis to obtain trellisMsk.

c. Perform logical OR of trellisMsk with perMsk. 
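
One way to picture the trellis pattern parameters is the Python sketch below. It is an illustration under loud assumptions rather than the described implementation: the lattice is taken to be axis-aligned, with a line of the given thickness starting at every multiple of the pitch, and the pattern is painted only on lattice pixels that fall inside the region mask so that the tissue between the lines stays visible. The stored trellis image itself is not specified in the text, and the function names are hypothetical.

```python
def trellis_pattern(height, width, pitch_x=8, pitch_y=8, thickness=2):
    """Binary lattice: vertical lines every pitch_x pixels, horizontal every pitch_y."""
    return [[int(x % pitch_x < thickness or y % pitch_y < thickness)
             for x in range(width)] for y in range(height)]


def paint_trellis(rgb, region, color=(0, 255, 104)):
    """Color lattice pixels inside `region`; all other pixels keep the underlying image."""
    pattern = trellis_pattern(len(rgb), len(rgb[0]))
    return [[color if region[y][x] and pattern[y][x] else rgb[y][x]
             for x in range(len(rgb[0]))] for y in range(len(rgb))]
```

The same approach would apply to the gray crosshatch pattern with its own pitch, slope, and gray value.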

[0714] The result of the disease probability display method 138 of Figure 74 is a state-of-health
"map" of a tissue sample, with annotations indicating indeterminate regions, necrotic
regions, and/or regions of low-to-high probability of high-grade disease. The disease display
overlay images contain indeterminate regions and regions of low-to-high probability of CIN 2/3.

[0715] In one embodiment, the disease display overlay image is produced immediately
following a patient scan in which spectral and image data are acquired and processed. This
allows a physician to provide on-the-spot diagnostic review immediately following the scan.
EQUIVALENTS 

[0716] While the invention has been particularly shown and described with reference to
specific preferred embodiments, it should be understood by those skilled in the art that various 
changes in form and detail may be made therein without departing from the spirit and scope of 
the invention as defined by the appended claims. 
What is claimed is: 
CLAIMS

1. A method of characterizing the condition of a region of a tissue sample, the method
comprising the steps of:
(a) determining at least one of:
(i) whether a region of a tissue sample lies outside a zone of interest; and
(ii) whether optical data obtained from said region are affected by an obstruction;
(b) processing a set of optical data obtained from said region to determine one or
more tissue-class probabilities; and
(c) characterizing a condition of said region based on results of said determining step
and said processing step.

2. The method of claim 1, wherein said optical data are spectral data.

3. The method of claim 1, wherein said condition is selected from the group consisting of
indeterminate, CIN 2/3, NED, and necrotic.

4. The method of claim 1, wherein tissue-class probability is a probability that said region
comprises tissue of a predetermined type, wherein said type is selected from the group consisting
of CIN 1, CIN 2, CIN 3, CIN 2/3, normal squamous, normal columnar, necrosis, NED,
metaplasia, and cancer.

5. The method of claim 1, wherein said one or more tissue-class probabilities comprise a
normal squamous probability, a normal columnar probability, a CIN 1 probability, a CIN 2/3
probability, and a metaplasia probability.

6. The method of claim 1, wherein said condition is indeterminate if said region is
determined to lie outside said zone of interest.

7. The method of claim 1, wherein said condition is indeterminate if spectral data obtained
from said region are determined to be affected by an obstruction.

8. The method of claim 1, wherein said processing step comprises weighting spectral data in
a statistical classification technique.

9. The method of claim 1, wherein said one or more tissue-class probabilities are weighted
according to a likelihood that a point within said region lies outside said zone of interest.

10. The method of claim 1, wherein said one or more tissue-class probabilities are weighted
according to a likelihood that spectral data obtained from said region are affected by an
obstruction.

11. The method of claim 1, wherein said determining step is based at least in part on image
data obtained from said region.
2 data obtained from said region. 
12. The method of claim 11, wherein said image data comprise data of a type selected from
the group consisting of RGB intensity, red intensity, green intensity, blue intensity, grayscale
luminance, and measured radiant power.

13. The method of claim 1, wherein said determining step is based at least in part on spectral
data obtained from said region.

14. The method of claim 1, wherein said determining step is based at least in part on image
data and spectral data obtained from said region.

15. The method of claim 1, wherein said determining step comprises identifying from said
tissue sample at least one member selected from the group consisting of a region of interest, a
vaginal wall area, a smoke tube area, an os area, and a cervical edge area.

16. The method of claim 1, wherein said obstruction comprises at least one member selected
from the group consisting of mucus, fluid, foam, a portion of a speculum, glare, shadow, and
blood.

17. The method of claim 1, further comprising obtaining a first set of data and a second set of
data from said region, and determining whether either of said first set and said second set is
affected by an artifact.

18. The method of claim 17, wherein said second set is redundant with said first set.

19. The method of claim 17, wherein said first set comprises spectral data obtained from said
region using light incident to said region at a first angle, and said second set comprises spectral
data obtained from said region using light incident to said region at a second angle.

20. The method of claim 17, wherein said first set and said second set comprise reflectance
data.

21. The method of claim 1, wherein said processing step comprises using spectral data to
evaluate a necrosis metric, and wherein said characterizing step comprises characterizing the
condition of said region as necrotic if said metric is satisfied.

22. The method of claim 1, wherein said processing step comprises using spectral data to
evaluate an NED metric, and wherein said characterizing step comprises characterizing the
condition of said region as NED if said metric is satisfied.

23. The method of claim 1, wherein said processing step comprises applying a statistical
classification technique to determine tissue-class probability.

24. The method of claim 23, wherein said statistical classification technique comprises a
principal component analysis method.
2 principal component analysis method. 
25. The method of claim 23, wherein said statistical classification technique comprises a
feature coordinate extraction method.

26. The method of claim 1, wherein said processing step comprises applying a plurality of
statistical classification techniques to determine tissue-class probability.

27. The method of claim 26, wherein said plurality of statistical classification techniques
comprise principal component analysis methods.

28. The method of claim 26, wherein said plurality of statistical classification techniques
comprise a principal component analysis method and a feature coordinate extraction method.

29. The method of claim 26, wherein said plurality of statistical classification techniques
comprise a DAFE classification method and a DASCO classification method.

30. The method of claim 1, the method further comprising the steps of using an optical
detection device to obtain spectral data from said region of said tissue sample, and compensating
for a relative motion between said tissue sample and said optical detection device.

31. The method of claim 1, wherein said characterizing step comprises assigning a
tissue-class probability to said region.

32. The method of claim 31, wherein said tissue-class probability is a CIN 2/3 probability.

33. The method of claim 1, further comprising the step of:
(d) displaying tissue-class probabilities of a plurality of regions of said tissue sample.

34. The method of claim 33, wherein said tissue-class probabilities are CIN 2/3 probabilities.

35. The method of claim 33, wherein said displaying step comprises displaying said
tissue-class probabilities overlaid onto a reference image comprising said plurality of regions.

36. The method of claim 33, wherein said displaying step is performed in real-time during a
patient examination.

37. The method of claim 33, wherein said displaying step comprises distinguishing regions of
said tissue sample with a high tissue-class probability from regions of said tissue sample with a
low tissue-class probability.

38. The method of claim 37, wherein said tissue-class probability is a CIN 2/3 probability.

39. The method of claim 1, wherein said set of spectral data comprise data of a type selected
from the group consisting of reflectance, fluorescence, Raman, and infrared data.

40. The method of claim 1, wherein said tissue sample comprises cervical tissue.

41. The method of claim 1, wherein said tissue sample comprises tissue of a type selected
from the group consisting of colorectal tissue, gastroesophageal tissue, urinary bladder tissue,
lung tissue, skin tissue, and epithelial tissue.

42. An apparatus for characterizing the condition of one or more regions of a tissue sample,
the apparatus comprising:
(a) an optical detection device adapted to obtain spectral data from a plurality of
regions of a tissue sample;
(b) a memory that stores code defining a set of instructions;
(c) a processor that executes said instructions thereby to:
identify spectral data obtained from substantially unobstructed members of said
plurality of regions, wherein said members are within a zone of interest;
determine tissue-class probabilities using said spectral data; and
determine a condition of one or more of said plurality of regions using said
tissue-class probabilities.

43. The method of claim 42, wherein said optical detection device is adapted to obtain
spectral data and image data from said plurality of regions.

44. The method of claim 43, wherein said processor is adapted to identify said spectral data
using image masking.

45. The method of claim 43, wherein said processor is adapted to identify said spectral data
using image masking and spectral masking.

46. A method of determining the condition of one or more regions of a tissue sample, the
method comprising the steps of:
(a) identifying spectral data obtained from substantially unobstructed regions of a
tissue sample using image data from said regions, wherein said regions are within a zone of
interest;
(b) determining tissue-class probabilities corresponding to each of said substantially
unobstructed regions using said spectral data; and
(c) determining a condition of one or more of said regions using said tissue-class
probabilities.

47. A method of determining a tissue-class probability for a region of tissue, the method
comprising the steps of:
(a) processing a first set of spectral data from a region of tissue to obtain a first measure
of tissue-class probability for said region of tissue, wherein said first set comprises reflectance
spectral data;
(b) processing a second set of spectral data from said region to obtain a second measure
of tissue-class probability for said region; and
(c) determining an overall tissue-class probability for said region using said first measure
and said second measure.

48. The method of claim 47, wherein tissue-class probability is a probability that said region
comprises tissue of a predetermined type, wherein said type is selected from the group consisting
of CIN 1, CIN 2, CIN 3, CIN 2/3, normal squamous, normal columnar, necrosis, NED,
metaplasia, and cancer.

49. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises using a statistical method based on maximal variance.

50. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises using a statistical method based on maximal discrimination.

51. The method of claim 47, wherein said first processing step comprises using a statistical
method based on maximal variance and said second processing step comprises using a statistical
method based on maximal discrimination.

52. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises performing a principal component analysis.

53. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises performing a feature coordinate extraction.

54. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises performing a discriminant analysis with shrunken covariances.

55. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises performing a discriminant analysis feature extraction.

56. The method of claim 47, wherein said first processing step comprises performing a
discriminant analysis with shrunken covariances and said second processing step comprises
performing a discriminant analysis feature extraction.

57. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises determining a statistical distance.

58. The method of claim 57, wherein said statistical distance is selected from the group
consisting of a Mahalanobis distance, a Bhattacharya distance, a Euclidian distance, and a
Jeffrey-Matsushita distance.

59. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises determining a statistical distance to feature centers in primary space
and a statistical distance to feature centers in secondary space.

60. The method of claim 47, wherein at least one of said first processing step and said second
processing step comprises determining a Bayes score.

61. The method of claim 47, wherein said first set and said second set share at least one
member.

62. The method of claim 47, wherein said first set and said second set are identical.

63. The method of claim 47, wherein said first set and said second set comprise reflectance
spectral data.

64. The method of claim 47, wherein at least one of said first set and said second set
comprises fluorescence spectral data.

65. The method of claim 47, wherein at least one of said first set and said second set
comprises data corresponding to wavelengths between about 370 nm and about 650 nm.

66. The method of claim 47, wherein said first set of spectral data consists of data
corresponding to wavelengths between about 400 nm and about 600 nm.

67. The method of claim 47, wherein said second set of spectral data consists of data
corresponding to wavelengths between about 370 nm and about 650 nm.

68. The method of claim 47, wherein at least one of said first set and said second set
comprises preprocessed spectral data.

69. The method of claim 68, wherein said preprocessed spectral data comprise data that are
filtered to remove members that are non-representative of said region.

70. A method of determining the condition of a region of tissue, the method comprising:
(a) for each of a plurality of predefined tissue classes, processing reflectance spectral data
obtained from a region of tissue to determine a first and a second measure of probability that said
region comprises tissue within said class; and
(b) determining a condition of said region using said first and said second measures.

71. The method of claim 70, wherein said condition is selected from the group consisting of
CIN 2/3, NED, indeterminate, and necrotic.

72. The method of claim 70, wherein one or more members of said plurality of predefined
tissue classes are selected from the group consisting of CIN 1, CIN 2, CIN 3, CIN 2/3, NED,
normal squamous, normal columnar, metaplasia, and cancer.

73. The method of claim 70, wherein said first processing step comprises using a principal
component analysis method to determine said first measure of probability and a feature
coordinate extraction method to determine said second measure of probability.

74. The method of claim 70, wherein said first processing step comprises comparing spectral
data obtained from said region with two or more sets of training data.

75. The method of claim 70, wherein said second processing step comprises determining an
overall probability that said region comprises tissue within said class, using said first and said
second measures.

76. The method of claim 75, wherein said overall probability is weighted according to a
likelihood that said region lies within a zone of interest.

77. The method of claim 75, wherein said overall probability is weighted according to a
likelihood that spectral data obtained from said region are affected by an obstruction.

78. A method of characterizing the condition of a region of tissue, the method comprising the
steps of:
(a) processing spectral data obtained from a region of tissue to determine, for each
member of a plurality of predefined tissue classes, a probability that said region comprises tissue
within said member;
(b) evaluating a classification metric using spectral data obtained from said region;
(c) if said classification metric is satisfied, characterizing a condition of said region
according to said classification metric; and
(d) if said classification metric is not satisfied, characterizing a condition of said
region according to said probabilities.

79. The method of claim 78, wherein said evaluating step comprises using fluorescence
spectral data.

80. The method of claim 78, wherein said processing step comprises processing reflectance
spectral data.

81. The method of claim 78, wherein said evaluating step comprises using fluorescence
spectral data and said processing step comprises processing reflectance spectral data.

82. The method of claim 78, wherein said processing step comprises applying one or more
statistical methods to a set of reflectance spectral data obtained from said tissue.

83. The method of claim 78, wherein said classification metric comprises a
non-statistically-based component.

84. The method of claim 83, wherein said non-statistically-based component is indicative of
a substance present in tissue within at least one of said predefined tissue classes.

1 85. The method of claim 84, wherein said substance is selected from the group consisting of 

2 collagen, porphyrin, FAD, and NADH. 
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1 86. The method of claim 78, wherein said classification metric comprises one or more 

2 statistically-based components and one or more non-statistically-based components. 

1 87. A method of using a spectral mask to process spectral data, the method comprising the 

2 steps of: 

3 (a) applying at least one spectral mask to identify a subset of spectral data obtained from 

4 a plurality of regions of a tissue sample, wherein said subset comprises data that are non- 

5 representative of a zone of interest of said tissue sample; 

6 (b) identifying one or more regions of said tissue sample from which said subset was 

7 obtained; and 

8 (c) processing spectral data obtained from said plurality of regions in a tissue 

9 classification scheme. 

1 88. The method of claim 87, wherein said subset consists of data that are non-representative 

2 of a zone of interest of said tissue sample. 

1 89. The method of claim 87, wherein said processing step comprises disqualifying said 

2 subset of spectral data from use in said tissue classification scheme. 

1 90. The method of claim 87, wherein said processing step comprises classifying said one or 

2 more identified regions as indeterminate. 

1 91. The method of claim 87, wherein said processing step comprises determining an overlap 

2 between said one or more regions and an area identified by one or more image masks. 

1 92. The method of claim 91, further comprising the step of classifying at least a portion of 

2 said overlap as indeterminate. 

1 93. The method of claim 91, further comprising the step of weighting spectral data from said

2 overlap in said tissue classification scheme. 

1 94. The method of claim 91, wherein said one or more image masks comprise a member 

2 selected from the group consisting of a smoke tube mask, a speculum mask, a region-of-interest 

3 mask, and a vaginal wall mask. 

1 95. The method of claim 87, wherein said spectral data comprises fluorescence spectral data 

2 and reflectance spectral data. 

1 96. The method of claim 87, wherein said at least one spectral mask comprises a cervical 

2 edge mask. 

1 97. The method of claim 96, wherein said cervical edge mask is based at least in part on a 

2 ratio of a first reflectance intensity at a first wavelength to a second reflectance intensity at a 

3 second wavelength.

1 98. The method of claim 97, wherein said first wavelength is about 700 nm and said second

2 wavelength is about 540 nm. 

1 99. The method of claim 96, wherein said cervical edge mask is based at least in part on a 

2 ratio of a first fluorescence intensity at a first wavelength to a second fluorescence intensity at a 

3 second wavelength. 

1 100. The method of claim 99, wherein said first wavelength is about 530 nm and said second 

2 wavelength is about 410 nm.

1 101. The method of claim 96, wherein said applying step comprises comparing the ratio F(530

2 nm)/F(410 nm) with a threshold of about 4.75.

1 102. The method of claim 96, wherein said cervical edge mask is based at least in part on (i) a

2 ratio of a first reflectance intensity to a second reflectance intensity and (ii) a ratio of a first 

3 fluorescence intensity to a second fluorescence intensity. 

1 103. The method of claim 96, wherein said cervical edge mask comprises the metric 

2 BB(450 nm)·BB(700 nm)/BB(540 nm) < 0.30 OR F(530 nm)/F(410 nm) > 4.75.
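By way of illustration only, and not as part of the claims, the metric recited in claim 103 can be sketched as a per-region test. The function and argument names are hypothetical, and the first term is read here as the product BB(450 nm)·BB(700 nm) divided by BB(540 nm), which is an assumption about the printed expression.

```python
def cervical_edge_mask(bb450, bb700, bb540, f530, f410):
    """Return True when a region is flagged by the cervical edge metric.

    bb* are reflectance intensities and f* are fluorescence intensities
    at the wavelengths (in nm) given in the argument names.
    Assumption: the first term is BB(450) * BB(700) / BB(540).
    """
    reflectance_term = (bb450 * bb700 / bb540) < 0.30
    fluorescence_term = (f530 / f410) > 4.75
    # The claim joins the two terms with OR.
    return reflectance_term or fluorescence_term
```

A region flagged by this test would then be handled in the tissue classification scheme as recited in claims 87 to 90, for example by being classified as indeterminate.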

1 104. The method of claim 87, wherein said at least one spectral mask comprises a mucus 

2 mask. 

1 105. The method of claim 104, wherein said mucus mask is based at least in part on a ratio of 

2 a first reflectance intensity at a first wavelength to a second reflectance intensity at a second 

3 wavelength.

1 106. The method of claim 105, wherein said first wavelength and said second wavelength

2 maximize a discrimination function comprising the ratio.

1 107. The method of claim 105, wherein said first wavelength is about 456 nm and said second

2 wavelength is about 542 nm.

1 108. The method of claim 104, wherein said applying step comprises comparing the ratio

2 BB(456 nm)/BB(542 nm) with a threshold of about 1.1.

1 109. The method of claim 104, wherein said applying step comprises comparing the ratio 

2 BB(594 nm)/BB(610 nm) with a threshold of about 0.74. 

1 110. The method of claim 104, wherein said mucus mask comprises the metric

2 BB(456 nm)/BB(542 nm) < 1.06 OR BB(594 nm)/BB(610 nm) > 0.74.
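For illustration only, the mucus metric of claim 110 can be sketched as follows. The function and argument names are hypothetical; the thresholds are those recited in the claim.

```python
def mucus_mask(bb456, bb542, bb594, bb610):
    """Return True when a region is flagged by the mucus metric.

    All arguments are reflectance intensities at the wavelengths (in nm)
    given in the argument names.
    """
    first_ratio_low = (bb456 / bb542) < 1.06
    second_ratio_high = (bb594 / bb610) > 0.74
    # The claim joins the two ratio tests with OR.
    return first_ratio_low or second_ratio_high
```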

1 111. A method of identifying a region of healthy tissue, the method comprising the steps of: 

2 (a) determining a first ratio of members selected from a set of spectral data corresponding 

3 to a region of tissue; 

4 (b) determining a second ratio of members selected from said set of spectral data; and

5 (c) evaluating a metric based at least in part on said first ratio and said second ratio to

6 determine whether said region of tissue is healthy tissue. 

1 112. The method of claim 111, wherein said first ratio is a ratio of a first fluorescence intensity 

2 at a first wavelength to a second fluorescence intensity at a second wavelength. 

1 113. The method of claim 112, wherein said second ratio is a ratio of a third fluorescence

2 intensity at a third wavelength to a reflectance intensity at said third wavelength. 

1 114. The method of claim 113, wherein said third wavelength is about 430 nm.

1 115. The method of claim 114, wherein said evaluating step comprises comparing said second 

2 ratio with a threshold of about 600 ct/μJ, where the mean fluorescence intensity of normal

3 squamous tissue is about 70 ct/μJ at about 450 nm.

1 116. The method of claim 112, wherein said first wavelength is about 450 nm and said second

2 wavelength is about 566 nm. 

1 117. The method of claim 116, wherein said evaluating step comprises comparing said first

2 ratio with a threshold of about 4.1.

1 118. The method of claim 112, wherein said first wavelength and said second wavelength are

2 chosen to maximize a fluorescence intensity difference due to a collagen peak. 

1 119. The method of claim 111, wherein said metric comprises 

2 F(430 nm)/BB(430 nm) > 600 ct/μJ OR F(450 nm)/F(566 nm) > 4.1 OR F(460) > {115·F(505

3 nm)/F(410 nm) - 40}, where the mean fluorescence intensity of normal squamous tissue is about

4 70 ct/μJ at about 450 nm.
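For illustration only, the metric of claim 119 can be sketched as a three-way OR test. The function and argument names are hypothetical. The sketch assumes that fluorescence values are already scaled so that normal squamous tissue averages about 70 ct/μJ at about 450 nm, and that the third term is read as 115 multiplied by F(505 nm)/F(410 nm), minus 40.

```python
def healthy_tissue_mask(f430, bb430, f450, f566, f460, f505, f410):
    """Return True when a region satisfies the healthy-tissue metric.

    f* are fluorescence intensities (ct/uJ) and bb430 is the reflectance
    intensity at 430 nm; wavelengths (in nm) are in the argument names.
    Assumption: the third term is F(460) > 115 * F(505)/F(410) - 40.
    """
    term_1 = (f430 / bb430) > 600.0
    term_2 = (f450 / f566) > 4.1
    term_3 = f460 > (115.0 * f505 / f410 - 40.0)
    return term_1 or term_2 or term_3
```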

1 120. The method of claim 111, further comprising the step of filtering said set of spectral data 

2 using a necrosis mask. 

1 121. The method of claim 111, further comprising the step of filtering said set of spectral data

2 using at least one image mask. 

1 122. The method of claim 111, wherein said image mask is selected from the group consisting 

2 of a region-of-interest mask, a smoke tube mask, and a speculum mask. 

1 123. A method of identifying a region of necrotic tissue, the method comprising the steps of:

2 (a) determining a first fluorescence intensity from a set of spectral data corresponding to 

3 a region of tissue; and 

4 (b) evaluating a metric based at least in part on said first fluorescence intensity in order to 
5 determine whether said region of tissue is necrotic tissue.

1 124. The method of claim 123, wherein said first fluorescence intensity is indicative of a 

2 porphyrin peak.

1 125. The method of claim 123, wherein said evaluating step comprises determining whether

2 said first fluorescence intensity exceeds a first threshold. 

1 126. The method of claim 125, wherein said first fluorescence intensity corresponds to a 

2 wavelength of about 635 nm. 

1 127. The method of claim 126, wherein said first threshold is about 20 ct/μJ, where the mean

2 fluorescence intensity of normal squamous tissue is about 70 ct/μJ at about 450 nm.

1 128. The method of claim 125, further comprising the step of determining a second 

2 fluorescence intensity and a third fluorescence intensity, and wherein said evaluating step 

3 comprises determining whether a ratio of said second fluorescence intensity and said third 

4 fluorescence intensity exceeds a second threshold. 

1 129. The method of claim 128, wherein said ratio of said second fluorescence intensity and 

2 said third fluorescence intensity is indicative of an FAD/NADH ratio. 

1 130. The method of claim 128, wherein said second fluorescence intensity corresponds to a 

2 wavelength of about 510 nm, and wherein said third fluorescence intensity corresponds to a

3 wavelength of about 450 nm. 

1 131. The method of claim 128, wherein said second threshold is about 1.0. 

1 132. The method of claim 123, wherein said metric comprises

2 F(510 nm)/F(450 nm) > 1.0 AND F(635 nm)/F(605 nm) > 1.3 AND F(635 nm)/F(660 nm) > 1.3

3 AND F(635 nm) > 20 ct/μJ, where the mean fluorescence intensity of normal squamous tissue is

4 about 70 ct/μJ at about 450 nm.
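For illustration only, the necrosis metric of claim 132 can be sketched as follows. The function and argument names are hypothetical, and the fluorescence values are assumed to be scaled so that normal squamous tissue averages about 70 ct/μJ at about 450 nm.

```python
def necrosis_mask(f450, f510, f605, f635, f660):
    """Return True when a region satisfies the necrosis metric.

    All arguments are fluorescence intensities (ct/uJ) at the
    wavelengths (in nm) given in the argument names.
    """
    ratio_510_450 = (f510 / f450) > 1.0   # indicative of an FAD/NADH ratio (claim 129)
    peak_vs_605 = (f635 / f605) > 1.3     # 635 nm peak relative to 605 nm
    peak_vs_660 = (f635 / f660) > 1.3     # 635 nm peak relative to 660 nm
    peak_level = f635 > 20.0              # absolute level at 635 nm
    # The claim joins all four conditions with AND.
    return ratio_510_450 and peak_vs_605 and peak_vs_660 and peak_level
```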

1 133. The method of claim 123, further comprising the step of filtering said set of spectral data 

2 using at least one image mask. 

1 134. The method of claim 123, wherein said image mask is selected from the group consisting 

2 of a smoke tube mask and a speculum mask. 

1 135. A method of using an image mask to process optical data, the method comprising the 

2 steps of: 

3 (a) providing image data from an area of a tissue sample; 

4 (b) identifying a subset of said image data using at least one image mask; 

5 (c) identifying one or more regions of said tissue sample from which said subset was 

6 obtained; and 

7 (d) processing optical data from said one or more regions. 

1 136. The method of claim 135, wherein said optical data is spectral data.

1 137. The method of claim 135, wherein said processing step comprises filtering spectral data

2 for use in a tissue classification scheme. 

1 138. The method of claim 137, wherein said processing step comprises disqualifying data 

2 corresponding to the one or more regions identified in step (c) from use in said tissue 

3 classification scheme. 

1 139. The method of claim 137, wherein said processing step comprises classifying the one or

2 more regions identified in step (c) as indeterminate. 

1 140. The method of claim 137, wherein said tissue classification scheme comprises a principal 

2 component analysis method. 

1 141. The method of claim 137, wherein said tissue classification scheme comprises a feature

2 coordinate extraction method. 

1 142. The method of claim 137, wherein said tissue classification scheme comprises a principal

2 component analysis method and a feature coordinate extraction method. 

1 143. The method of claim 135, wherein said processing step comprises determining a percent 

2 mask coverage for each of the one or more regions identified in step (c). 

1 144. The method of claim 143, wherein said processing step comprises applying a weighting 

2 factor according to said percent mask coverage. 
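For illustration only, claims 143 and 144 can be sketched as a coverage calculation followed by a weighting rule. The claims do not specify the form of the weighting factor; the linear rule below, and all function and argument names, are hypothetical.

```python
def percent_mask_coverage(mask, region_pixels):
    """Percent of a region's pixels that fall under a binary image mask.

    mask: 2D list of 0/1 values (1 = masked pixel).
    region_pixels: iterable of (row, col) pixel coordinates in the region.
    """
    pixels = list(region_pixels)
    covered = sum(mask[row][col] for row, col in pixels)
    return 100.0 * covered / len(pixels)


def coverage_weight(percent_covered):
    """Hypothetical linear rule: weight falls from 1 to 0 as coverage rises."""
    return 1.0 - percent_covered / 100.0
```

Under this assumed rule, a region that is 75% covered by a mask would contribute to the classification with a weight of 0.25.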

1 145. The method of claim 135, wherein said at least one image mask comprises a binary 

2 image mask. 

1 146. The method of claim 135, wherein said at least one image mask identifies a set of pixels. 

1 147. The method of claim 135, wherein said at least one image mask comprises an obstruction 

2 mask. 

1 148. The method of claim 147, wherein said obstruction mask is selected from the group 

2 consisting of a blood mask, a mucus mask, a speculum mask, and a pooled fluid and foam mask. 

1 149. The method of claim 135, wherein said first identifying step comprises thresholding an 

2 initial mask and performing a binary component analysis. 

1 150. The method of claim 135, wherein said at least one image mask comprises a glare mask.

1 151. The method of claim 150, wherein said first identifying step comprises dividing an image

2 into a plurality of blocks, determining a histogram corresponding to each of the blocks, and 

3 computing one or more thresholds for each of the blocks based on its corresponding histogram. 

1 152. The method of claim 135, wherein said at least one image mask comprises at least one of 

2 the group consisting of an os mask, a smoke tube mask, a vaginal wall mask, and a region-of-

3 interest mask.

1 153. The method of claim 152, wherein said first identifying step comprises determining a

2 gradient image, using said gradient image to determine a skeletonized image, and performing 

3 edge linking and edge extension using said skeletonized image. 

1 154. The method of claim 152, wherein said first identifying step comprises thresholding a red

2 channel component of said image data. 

1 155. The method of claim 135, wherein said at least one image mask comprises at least three 

2 of the group consisting of a blood mask, a mucus mask, a speculum mask, a pooled fluid and 

3 foam mask, a glare mask, an os mask, a smoke tube mask, a vaginal wall mask, and a region-of- 

4 interest mask. 

1 156. The method of claim 135, wherein said at least one image mask comprises at least six of

2 the group consisting of a blood mask, a mucus mask, a speculum mask, a pooled fluid and foam 

3 mask, a glare mask, an os mask, a smoke tube mask, a vaginal wall mask, and a region-of- 

4 interest mask. 

1 157. The method of claim 135, wherein said at least one image mask comprises the group 

2 consisting of a blood mask, a mucus mask, a speculum mask, a pooled fluid and foam mask, a 

3 glare mask, an os mask, a smoke tube mask, a vaginal wall mask, and a region-of-interest mask. 

1 158. A method of displaying diagnostic data, the method comprising the steps of: 

2 (a) providing a reference image of a tissue sample; 

3 (b) providing a tissue-class probability corresponding to each member of a plurality of 

4 regions of said tissue sample; 

5 (c) creating an overlay comprising colors as a proxy for said tissue-class probabilities; 

6 and 

7 (d) displaying said reference image with said overlay. 

1 159. The method of claim 158, wherein tissue-class probability is a probability that a region 

2 comprises tissue of a predetermined type, wherein said type is selected from the group consisting 

3 of CIN 1, CIN 2, CIN 3, CIN 2/3, metaplasia, NED, and cancer. 

1 160. The method of claim 158, wherein said creating step comprises assigning grayscale

2 luminance values as said proxy for said tissue-class probabilities. 

1 161. The method of claim 158, wherein said creating step comprises assigning RGB color 

2 values as said proxy for said tissue-class probabilities. 

1 162. The method of claim 158, wherein said creating step comprises assigning grayscale

2 luminance values to said tissue-class probabilities and converting said luminance values to RGB 

3 color values.

1 163. The method of claim 158, wherein said colors are blended to provide diagnostically

2 relevant information. 

1 164. The method of claim 158, wherein said creating step comprises assigning grayscale 

2 luminance values to said tissue-class probabilities, spatially filtering said grayscale luminance 

3 values, and converting said filtered grayscale luminance values to RGB color values. 

1 165. The method of claim 158, wherein said creating step comprises spatially filtering values 

2 of said tissue-class probabilities, assigning grayscale luminance values to said filtered probability 

3 values, and converting said grayscale luminance values to RGB color values. 

1 166. The method of claim 158, wherein at least one of said colors is yellow. 

1 167. The method of claim 158, wherein at least one of said colors is blue.

1 168. The method of claim 158, wherein said colors comprise a continuum from yellow to blue. 

1 169. The method of claim 168, wherein said continuum varies from an average tissue color to

2 a first reference color. 

1 170. The method of claim 158, wherein said overlay identifies at least one indeterminate 

2 region of said tissue sample. 

1 171. The method of claim 170, wherein said overlay identifies an indeterminate region without

2 obscuring a corresponding portion of said reference image. 

1 172. The method of claim 170, wherein said indeterminate region is identified using a 

2 crosshatch pattern or a trellis pattern.

1 173. The method of claim 158, wherein said overlay identifies at least one necrotic region of 

2 said tissue sample. 

1 174. The method of claim 173, wherein said overlay identifies a necrotic region without

2 obscuring a corresponding portion of said reference image. 

1 175. The method of claim 173, wherein said necrotic region is identified using a crosshatch

2 pattern or a trellis pattern. 

1 176. The method of claim 158, wherein said displaying step is performed in real time during a 

2 patient examination. 

1 177. The method of claim 158, wherein said displaying step is performed within about an hour 

2 of a patient examination. 

1 178. A method of displaying diagnostic data, the method comprising the steps of: 

2 (a) providing a reference image of a tissue sample; 

3 (b) providing a tissue-class probability corresponding to each member of a plurality of 

4 regions of said tissue sample;

5 (c) creating an overlay comprising colors as a proxy for said tissue-class probabilities,

6 wherein said colors are blended to provide diagnostically relevant information; and 

7 (d) displaying said reference image with said overlay. 

1 179. The method of claim 178, wherein said creating step comprises assigning grayscale 

2 luminance values to said tissue-class probabilities, spatially filtering said grayscale luminance 

3 values, and converting said filtered grayscale luminance values to RGB color values. 

1 180. A method of creating an overlay for displaying diagnostic data, the method comprising

2 the steps of: 

3 (a) providing a tissue-class probability corresponding to each member of a plurality of 

4 regions of a tissue sample; and 

5 (b) creating an overlay comprising colors as a proxy for said tissue-class probabilities, 

6 wherein said colors are blended to provide diagnostically relevant information. 

1 181. A method of calibrating spectral data obtained from a tissue sample, the method 

2 comprising the steps of: 

3 (a) obtaining calibration data from a plurality of spaced-apart locations on a calibration 

4 target; 

5 (b) obtaining a set of spectral data from spaced-apart locations of a tissue sample, 

6 wherein at least some of said spaced-apart locations of said tissue sample correspond to said 

7 spaced-apart locations on said calibration target; and 

8 (c) calibrating said spectral data obtained from said tissue sample using said calibration 

9 data, thereby to produce calibrated data. 

1 182. The method of claim 181, wherein said calibration data comprises reflectance spectral

2 data. 

1 183. The method of claim 181, wherein said calibration data comprises fluorescence spectral

2 data. 

1 184. The method of claim 181, wherein said first obtaining step comprises using an optical

2 instrument to obtain said calibration data as part of an initial calibration of said optical 

3 instrument. 

1 185. The method of claim 181, wherein said first obtaining step comprises using an optical 

2 instrument to obtain said calibration data as part of a periodic calibration of said optical 

3 instrument. 

1 186. The method of claim 181, wherein said calibration target comprises a fluorescent dye. 

1 187. The method of claim 186, wherein said calibration target comprises coumarin-515 dye.

1 188. The method of claim 181, further comprising the step of:

2 obtaining measures of instrument response using a reference light source. 

1 189. The method of claim 188, wherein said reference light source comprises a filament that 

2 approximates a blackbody emitter. 

1 190. The method of claim 189, wherein said filament is a tungsten filament.

1 191. The method of claim 181, further comprising the steps of: 

2 obtaining a mercury spectrum and an argon spectrum; and 

3 converting a CCD pixel index to a wavelength using data from said mercury spectrum 

4 and said argon spectrum. 

1 192. The method of claim 181, further comprising the step of:

2 processing said calibrated data in a tissue classification algorithm.

1 193. A method of calibrating spectral data obtained from a tissue sample, the method

2 comprising the steps of: 

3 (a) obtaining a first set of calibration data from a plurality of spaced-apart locations on a 

4 first calibration target; 

5 (b) obtaining a second set of calibration data from a plurality of spaced-apart locations on 

6 a second calibration target; 

7 (c) obtaining a set of spectral data from spaced-apart locations of a tissue sample, 

8 wherein at least some of said spaced-apart locations of said tissue sample correspond to said 

9 spaced-apart locations on said first calibration target and said spaced-apart locations on said 

10 second calibration target; and 

11 (d) calibrating said spectral data obtained from said tissue sample using said first set of

12 calibration data and said second set of calibration data, thereby to produce calibrated data. 

1 194. The method of claim 193, wherein said second calibration target is a single-use

2 disposable target. 

1 195. The method of claim 193, wherein said first calibration target has a reflectance of about

2 60% and said second calibration target has a reflectance of about 10%. 

1 196. The method of claim 193, wherein said second set of calibration data is obtained within

2 about 24 hours of obtaining said set of spectral data from said tissue sample. 

1 197. The method of claim 193, wherein said second set of calibration data is obtained within 

2 about 1 hour of obtaining said set of spectral data from said tissue sample. 

1 198. The method of claim 193, wherein said calibrating step comprises processing said 

2 spectral data according to the equation:

3 $R(i,\lambda,t') = \left[ I_{m}(i,\lambda,t') / \langle I_{cp}(i,\lambda,t') \rangle_{i} \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_{0}) \rangle_{i} / I_{fc}(i,\lambda,t_{0}) \right] \cdot R_{cp}$,

4 wherein $R(i,\lambda,t')$ is an array comprising calibrated reflectance spectral data from said tissue

5 sample at regions $i$, wavelengths $\lambda$, and at time $t'$, $I_{m}(i,\lambda,t')$ is an array comprising reflectance

6 spectral data from said tissue sample, $I_{fc}(i,\lambda,t_{0})$ is an array comprising said first set of calibration

7 data obtained at time $t_{0}$ different from $t'$, $\langle I_{fc}(i,\lambda,t_{0}) \rangle_{i}$ is an array comprising said first set of

8 calibration data averaged over said $i$ regions, $\langle I_{cp}(i,\lambda,t') \rangle_{i}$ is an array comprising said second set

9 of calibration data averaged over said $i$ regions, and $R_{cp}$ is the reflectance of said second

10 calibration target.
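For illustration only, the calibration equation of claim 198 can be sketched with plain nested lists indexed as [region][wavelength]. The function and argument names are hypothetical, background subtraction (claims 199 and 200) is assumed to have been applied beforehand, and the reflectance of the second calibration target is treated as a single scalar, which is a simplifying assumption.

```python
def mean_over_regions(data):
    """Average a [region][wavelength] table over regions, per wavelength."""
    n_regions = len(data)
    return [sum(column) / n_regions for column in zip(*data)]


def calibrate_reflectance(i_m, i_cp, i_fc, r_cp):
    """Apply R = [I_m / <I_cp>_i] * [<I_fc>_i / I_fc] * R_cp.

    i_m:  tissue data at time t'                [region][wavelength]
    i_cp: second-target data at time t'         [region][wavelength]
    i_fc: first-target data at time t0          [region][wavelength]
    r_cp: reflectance of the second target (scalar here, an assumption)
    """
    mean_cp = mean_over_regions(i_cp)  # <I_cp>_i, one value per wavelength
    mean_fc = mean_over_regions(i_fc)  # <I_fc>_i, one value per wavelength
    return [
        [(i_m[i][k] / mean_cp[k]) * (mean_fc[k] / i_fc[i][k]) * r_cp
         for k in range(len(mean_cp))]
        for i in range(len(i_m))
    ]
```

The first bracket scales the tissue signal by the region-averaged response of the second target measured at about the same time, and the second bracket corrects each region for its deviation from the average response of the first target.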

1 199. The method of claim 198, wherein data in array $I_{m}(i,\lambda,t')$ are background subtracted.

1 200. The method of claim 198, wherein data in at least one of arrays $I_{cp}(i,\lambda,t')$ and $I_{fc}(i,\lambda,t_{0})$ are

2 background subtracted.

1 201. The method of claim 193, further comprising the step of: 

2 obtaining a third set of calibration data using said second calibration target, wherein said 

3 calibration step comprises calibrating said spectral data obtained from said tissue sample using 

4 said first set of calibration data, said second set of calibration data, and said third set of 

5 calibration data. 

1 202. The method of claim 201, wherein said calibrating step comprises processing said

2 spectral data according to the equation:

3 $R(i,\lambda,t') = \left[ I_{m}(i,\lambda,t') / \langle I_{cp}(i,\lambda,t') \rangle_{i} \right] \cdot \left[ \langle I_{fc}(i,\lambda,t_{0}) \rangle_{i} / I_{fc}(i,\lambda,t_{0}) \right] \cdot R_{cp,fitted}$,

4 wherein $R(i,\lambda,t')$ is an array comprising calibrated reflectance spectral data from said tissue

5 sample at regions $i$, wavelengths $\lambda$, and at time $t'$, $I_{m}(i,\lambda,t')$ is an array comprising reflectance

6 spectral data from said tissue sample, $I_{fc}(i,\lambda,t_{0})$ is an array comprising said first set of calibration

7 data obtained at time $t_{0}$ different from $t'$, $\langle I_{fc}(i,\lambda,t_{0}) \rangle_{i}$ is an array comprising said first set of

8 calibration data averaged over said $i$ regions, $\langle I_{cp}(i,\lambda,t') \rangle_{i}$ is an array comprising said second set

9 of calibration data averaged over said $i$ regions, $R_{cp,fitted}$ is an array of values of a curve fit of

10 $R_{cp}(\lambda)$, where $R_{cp}(\lambda) = \left[ \langle I_{cp}(i,\lambda,t_{0}) \rangle_{i} / \langle I_{fc}(i,\lambda,t_{0}) \rangle_{i} \right] \cdot R_{fc}$, and where $\langle I_{cp}(i,\lambda,t_{0}) \rangle_{i}$ is an array

11 comprising said third set of calibration data obtained at time $t_{0}$ and averaged over said $i$ regions,

12 and $R_{fc}$ is the reflectance of said first calibration target.

1 203. A method of calibrating spectral data obtained from a tissue sample, the method

2 comprising the steps of: 

3 (a) obtaining calibration data from a plurality of spaced-apart locations on a calibration 

4 target using an optical instrument with a first attached disposable component;

5 (b) obtaining a set of spectral data from spaced-apart locations of a tissue sample,

6 wherein at least some of said spaced-apart locations of said tissue sample correspond to said 

7 spaced-apart locations on said calibration target; and 

8 (c) calibrating said spectral data obtained from said tissue sample using said calibration

9 data, thereby to produce calibrated data.

1 204. The method of claim 203, wherein said disposable component is a protective sheath. 

1 205. The method of claim 203, wherein said set of spectral data from said tissue sample is

2 obtained using said optical instrument with a second attached disposable component in place of 

3 said first disposable component. 

1 206. The method of claim 203, further comprising the step of: 

2 obtaining an additional set of calibration data from a plurality of spaced-apart locations 

3 on an additional calibration target using said optical instrument with a second attached 

4 disposable component in place of said first disposable component, wherein said calibrating step 

5 comprises calibrating said spectral data obtained from said tissue sample using said calibration 

6 data and said additional calibration data. 

1 207. The method of claim 206, wherein said set of spectral data from said tissue sample is

2 obtained using said optical instrument with said second attached disposable component. 

1 208. A method of correcting spectral data from a tissue sample for stray light internal to an

2 optical instrument, the method comprising the steps of:

3 (a) obtaining a first set of spectral data using a target and using a light source internal to

4 an optical instrument, wherein said instrument yields a residual optical signal; 

5 (b) obtaining a second set of spectral data using a light source internal to said optical 

6 instrument, with no external light source; 

7 (c) obtaining a third set of spectral data from a tissue sample; and 

8 (d) adjusting said third set of data using a subset of said first set of data and a subset of 

9 said second set of data. 

1 209. The method of claim 208, wherein said second obtaining step comprises obtaining said 

2 second set of spectral data without a target. 

1 210. The method of claim 208, wherein said target has substantially no diffuse reflectance. 

1 211. The method of claim 208, further comprising the step of: 

2 obtaining a fourth set of spectral data from a target yielding substantially no optical 

3 signal, wherein said adjusting step comprises adjusting said third set of data using a subset of 

4 said first set of data, a subset of said second set of data, and a subset of said fourth set of data.

1 212. The method of claim 211, wherein the step of obtaining the fourth set of spectral data is

2 performed within about one hour of obtaining said third set of spectral data. 

1 213. A method of focusing an optical instrument on a tissue sample, the method comprising 

2 the steps of: 

3 (a) projecting a plurality of light spots onto a tissue sample; 

4 (b) superimposing a plurality of focusing elements in a visual field comprising said tissue 

5 sample; and 

6 (c) aligning a subset of said light spots substantially within said focusing elements. 

1 214. The method of claim 213, wherein said projecting step comprises projecting a plurality of 

2 laser beams toward said tissue sample. 

1 215. The method of claim 214, wherein each member of said plurality of laser beams strikes 

2 said tissue sample at a different angle. 

1 216. The method of claim 214, wherein each member of said plurality of laser beams strikes 

2 said tissue sample at a fixed angle with respect to an objective axis. 

1 217. The method of claim 213, wherein said projecting step comprises projecting four light

2 spots. 

1 218. The method of claim 213, wherein said focusing elements are rings. 

1 219. The method of claim 213, wherein said superimposing step comprises superimposing 

2 said plurality of focusing elements in a sequence of images of said tissue. 

1 220. The method of claim 219, wherein said sequence of images comprises real-time video

2 images. 

1 221. The method of claim 213, wherein said superimposing step comprises displaying said 

2 plurality of focusing rings in a viewfinder. 

1 222. The method of claim 213, wherein said tissue sample comprises in situ tissue.

1 223. The method of claim 213, wherein said aligning step comprises adjusting a component of 

2 an optical instrument used to visualize said tissue. 

1 224. The method of claim 223, wherein said component is a probe. 

1 225. A method of focusing an optical instrument on a tissue sample, the method comprising 

2 the steps of: 

3 (a) projecting a plurality of light spots onto a tissue sample; 

4 (b) superimposing a plurality of focusing elements in a visual field comprising said tissue 

5 sample; 

6 (c) aligning a subset of said light spots substantially within said focusing elements; and

7 (d) automatically validating an alignment of said subset of light spots within said

8 focusing elements. 

1 226. The method of claim 225, wherein said projecting step comprises projecting a plurality of 

2 laser beams toward said tissue sample. 

1 227. The method of claim 225, wherein said validating step comprises detecting locations of 

2 said subset of light spots. 

1 228. The method of claim 227, wherein said validating step comprises using a measure of 

2 greenness to detect said locations. 

1 229. The method of claim 228, wherein said measure of greenness is expressed as Ge = G - R 

2 - 15, where Ge is the measure of greenness, G is a green channel value, and R is a red channel

3 value. 
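For illustration only, the greenness measure of claim 229 can be sketched as a per-pixel calculation. The function names and the representation of an image as a 2D list of (R, G, B) tuples are hypothetical.

```python
def greenness(g, r):
    """Greenness measure Ge = G - R - 15 for one pixel."""
    return g - r - 15


def greenness_map(rgb_image):
    """Apply the greenness measure to every pixel of an image.

    rgb_image: 2D list of (R, G, B) tuples (assumed layout).
    Returns a 2D list of Ge values with the same shape.
    """
    return [[greenness(g, r) for (r, g, _b) in row] for row in rgb_image]
```

Pixels with high Ge values are candidates for the projected light spots when the spots are green; how the resulting map is thresholded is not specified in claim 229.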

1 230. The method of claim 227, wherein said validating step comprises using a measure of 

2 blueness to detect said locations. 

1 231. The method of claim 227, wherein said validating step comprises using a measure of a

2 color corresponding to a color of said light spots to detect said locations. 

1 232. The method of claim 227, wherein an image of said tissue is enhanced to increase 

2 contrast between said light spots and surrounding tissue. 

1 233. The method of claim 227, wherein said validating step comprises comparing said 

2 locations with predetermined positions. 

1 234. The method of claim 227, wherein said validating step comprises applying a decision rule 

2 based at least in part on said locations. 

1 235. The method of claim 234, wherein said decision rule is based at least in part on a number

2 of light spots detected. 

1 236. The method of claim 225, wherein said validating step comprises performing iterative 

2 dynamic thresholding. 

1 237. The method of claim 236, wherein said validating step comprises performing 

2 morphological processing between thresholding iterations. 

1 238. The method of claim 225, further comprising the step of: 

2 (e) obtaining diagnostic optical data from said tissue sample after said validating step, 

3 wherein said obtaining occurs within an optimal data acquisition window. 

1 239. The method of claim 238, wherein said optimal data acquisition window is a period of

2 time beginning about 30 seconds after an application of a contrast agent to said tissue sample and 

3 ending about 130 seconds after said application.

1 240. A method of enhancing an image of a tissue sample, the method comprising the steps of:

2 (a) providing input luminance values from an image of a tissue sample; 

3 (b) filtering said input luminance values using one or more image masks; 

4 (c) transforming said filtered input luminance values to obtain output luminance values; 

5 and 

6 (d) producing an enhanced image of said tissue sample using said output luminance 

7 values. 

1 241. The method of claim 240, wherein said filtering step comprises removing input 

2 luminance values corresponding to an area outside a region of interest of said image. 

1 242. The method of claim 240, wherein said one or more image masks comprises a mask 

2 selected from the group consisting of a region-of-interest mask, a glare mask, a speculum mask, 

3 an os mask, a blood mask, a mucus mask, and a smoke tube mask. 
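
By way of non-limiting illustration, the filtering of claims 241 and 242 might keep only luminance values that fall inside a region-of-interest mask and outside every exclusion mask (glare, blood, and so on). The representation of masks as nested lists of booleans and the function name are assumptions made for the example.

```python
def filter_luminance(luminance, roi_mask, exclusion_masks=()):
    """Collect luminance values inside the ROI and outside all exclusion masks.

    `luminance` is a list of rows of numbers. `roi_mask` and each entry of
    `exclusion_masks` are same-shaped lists of rows of booleans, where True
    marks a pixel belonging to that mask.
    """
    kept = []
    for y, row in enumerate(luminance):
        for x, value in enumerate(row):
            if not roi_mask[y][x]:
                continue  # outside the region of interest
            if any(mask[y][x] for mask in exclusion_masks):
                continue  # flagged by a glare, blood, or similar mask
            kept.append(value)
    return kept
```

The retained values could then feed the histogram from which transformation parameters are determined, as recited in claim 244.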

1 243. The method of claim 240, wherein said transforming step comprises using a piecewise 

2 linear transformation. 

1 244. The method of claim 240, wherein said transforming step comprises using one or more 

2 parameters determined from a histogram of said filtered input luminance values. 

1 245. The method of claim 244, wherein said parameters comprise two piecewise linear 

2 breakpoints corresponding to said filtered input luminance values. 

1 246. The method of claim 240, wherein said transforming step comprises using the equation: 

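\[
v = \begin{cases}
\alpha\mu, & L_{\min} \le \mu < \mu_a \\
\beta(\mu - \mu_a) + v_a, & \mu_a \le \mu < \mu_b \\
\gamma(\mu - \mu_b) + v_b, & \mu_b \le \mu < L_{\max}
\end{cases}
\]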

3 where L_max is a maximum from said filtered input luminance values, L_min is a minimum from 

4 said filtered input luminance values, μ_a and μ_b are piecewise linear breakpoints corresponding to 

5 said filtered input luminance values, v_a and v_b are piecewise linear breakpoints corresponding to 

6 said output luminance values, and α, β, and γ are slopes of said piecewise linear transformation. 
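
By way of non-limiting illustration, the piecewise linear transformation of claim 246 can be sketched for a single luminance value, assuming the form v = αμ below μ_a, v = β(μ − μ_a) + v_a between μ_a and μ_b, and v = γ(μ − μ_b) + v_b above μ_b. The claim treats the slopes as given; deriving them from the breakpoints so that the segments join continuously, and the `v_max` output ceiling, are assumptions of this sketch.

```python
def piecewise_linear(mu, mu_a, mu_b, v_a, v_b, l_max, v_max=255.0):
    """Map an input luminance `mu` to an output luminance.

    Slopes are derived so that adjacent segments meet at the breakpoints
    (an assumption; the claim only names alpha, beta, and gamma as slopes).
    `v_max` is a hypothetical output value assigned to the input maximum.
    """
    alpha = v_a / mu_a                       # slope below the first breakpoint
    beta = (v_b - v_a) / (mu_b - mu_a)       # slope between the breakpoints
    gamma = (v_max - v_b) / (l_max - mu_b)   # slope above the second breakpoint
    if mu < mu_a:
        return alpha * mu
    if mu < mu_b:
        return beta * (mu - mu_a) + v_a
    return gamma * (mu - mu_b) + v_b
```

Choosing μ_a and μ_b near the tails of the luminance histogram makes β steep, which stretches contrast across the mid-range where most tissue pixels are likely to fall.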

1 247. A method of enhancing an image of a tissue sample, the method comprising the steps of: 

2 (a) providing input data from an image of a tissue sample, said input data comprising 

3 luminance values; 

4 (b) filtering said input data to remove luminance values corresponding to an area outside 

5 a region of interest of said image; 

6 (c) transforming said filtered input data to obtain output data; 

7 (d) spatially filtering said output data to produce contrast-enhanced output data; and 

8 (e) producing an enhanced image of said tissue sample using said contrast-enhanced data. 



1 248. The method of claim 247, further comprising the step of applying a correction to said 

2 contrast-enhanced output data to produce color-balanced, contrast-enhanced output data, and 

3 wherein said producing step comprises using said color-balanced, contrast-enhanced data to 

4 produce said enhanced image. 

1 249. The method of claim 247, wherein said transforming step comprises using a piecewise 

2 linear transformation. 

1 250. The method of claim 247, wherein said transforming step comprises using one or more 

2 parameters determined from a histogram of said filtered input luminance values. 

1 251. The method of claim 250, wherein said parameters comprise two piecewise linear 

2 breakpoints corresponding to said filtered input luminance values. 

1 252. The method of claim 247, wherein said transforming step comprises using the equation: 

\[
v = \begin{cases}
\alpha\mu, & L_{\min} \le \mu < \mu_a \\
\beta(\mu - \mu_a) + v_a, & \mu_a \le \mu < \mu_b \\
\gamma(\mu - \mu_b) + v_b, & \mu_b \le \mu < L_{\max}
\end{cases}
\]

3 where L_max is a maximum from said filtered input luminance values, L_min is a minimum from 

4 said filtered input luminance values, μ_a and μ_b are piecewise linear breakpoints corresponding to 

5 said filtered input luminance values, v_a and v_b are piecewise linear breakpoints corresponding to 

6 said output luminance values, and α, β, and γ are slopes of said piecewise linear transformation. 